Abstract: Distinguishing quark-initiated jets from gluon-initiated jets has the potential
to significantly improve the reach of many beyond-the-standard-model searches at the Large
Hadron Collider and to provide additional tests of QCD. To explore whether quark and
gluon jets could possibly be distinguished on an event-by-event basis, we perform a
comprehensive simulation-based study. We explore a variety of motivated and unmotivated
variables with a semi-automated multivariate approach. General conclusions are that at
50% quark jet acceptance efficiency, around 80%-90% of gluon jets can be rejected. Some
benefit is gained by combining variables. Different event generators are compared, as are
the effects of using only charged tracks to avoid pileup. Additional information,
including interactive distributions of most variables and their cut efficiencies, can be found at
http://jets.physics.harvard.edu/qvg.



Contents 

1 Introduction
2 History and Future of Quark/Gluon Measurements
3 Theoretical Considerations
4 Event Generation
5 Overview of Observables
6 Evaluation of Discrimination Power
7 Discrete (Particle/Track/Subjet) Variable Results
8 Jet Shapes and Geometric Moments
    8.1 Jet Mass
    8.2 Traditional Jet Shape
    8.3 Radial Geometric Moments
    8.4 Linear Radial Geometric Moment: Girth, Width, or Jet Broadening
    8.5 Jet Angularities
    8.6 Optimal Kernel for Radial Moment
    8.7 N-subjettiness
    8.8 Two-Point Moment
    8.9 Two-Dimensional Geometric Moments
    8.10 Pull
9 Combining Variables
10 Comparing Variables
11 Using Impure Samples to Verify Underlying Pure Distributions
12 Choosing the Operating Point for a Mixed Background
13 Comparing Herwig++ to Pythia8
14 Conclusions







Figure 1. Gluino decay as an example of a quark-heavy signal, in this case with 8 quark jets and no
gluon jets produced. Multi-jet events in standard model backgrounds are extremely unlikely to have 
so many quark jets. 



1 Introduction 

Being able to distinguish quark-initiated from gluon-initiated jets reliably at the LHC could
be fantastically useful, since signatures of beyond-the-standard-model physics are often quark
heavy. For example, a typical gluino-pair production topology is pictured in Figure 1. Gluinos
are produced in pairs, and each gluino's cascade decay can produce four quarks and missing
transverse momentum due to the escape of the lightest supersymmetric partner. Backgrounds to this
process have events with many jets produced from QCD. These jets are predominantly gluonic.
Additionally, many R-parity violating SUSY models produce quark jets without the
missing transverse momentum. To constrain these models, being able to filter out background
QCD events containing gluon jets would be helpful. Leptophobic Z' or W' particles provide
other obvious examples where quark/gluon discrimination would be useful.

Gluon-heavy backgrounds are especially problematic for signals without leptons, gauge
bosons, b-jets, tops, or missing energy. Quark/gluon tagging might be one of the few ways
to improve these searches. Another application is to reduce combinatorial ambiguity
within a single event. If jets in a given event could be identified as quark or gluon, their
place in a proposed decay topology could be constrained, or they could be classified as
initial-state radiation. Examining the quark/gluon tagging scores of jets produced by a new particle
might be the only way to measure QCD quantum numbers directly. Alternatively, some
signals consist of gluon jets, like coloron models [1] or buried-Higgs, where h → 2a → 4g
and a is a CP-odd scalar [2]. The same observables and techniques apply to gluon tagging,
though here we will treat the quark jets as the signal and the gluon jets as background for
concreteness.

Figure 2. Fraction of jets which are a light quark jet (up, down or strange) rather than a gluon jet.
Here all jets have the minimum p_T cut indicated, but photons have a minimum p_T of only 20 GeV.

Practical quark/gluon discrimination would also be useful for some standard model studies.
For example, in vector boson fusion (VBF), the forward jets are always quark jets whereas
in non-electroweak backgrounds to VBF, the jets near the beams are often gluonic. In the
standard model, as the p_T of jets increases, or if they are produced along with an electroweak
boson, the fraction becomes more quark-heavy. This is shown in Figure 2. Thus, knowing
the quark-to-gluon jet fraction of an event can help determine the underlying hard
partons, with applications even in the standard model.

Differences between quark and gluon jets were measured in great detail in LEP 3-jet 
events, where the flavor could be known to high accuracy. Such measurements are described
well by perturbative QCD calculations and leading-log parton showers combined with
hadronization models. In the LHC era, we propose using this accumulated wisdom as a tool 
to find new physics. The small differences between generators do not invalidate the use of 
these tools to find observables that can distinguish between quark- and gluon-initiated jets 
on a jet-by-jet basis. Experimental effort can then be focused on the small set of the most 
powerful discriminators. 

The goal of this hadron-level Monte Carlo study is to find properties of jets that best
distinguish ones initiated by a quark from those initiated by a gluon. Charged particle
count and jet mass are well-known examples, but new observables like p_T-weighted moments
as measured from the jet center and subjet properties provide additional handles. Each
observable is examined for many jet p_T values, and the best set is combined into a quark/gluon
tagger. Given a jet of a particular p_T, our tagger assigns a quark/gluon likelihood score. This
can be cut on to purify the flavor content, combined with a prior quark/gluon fraction into a
true probability, or used in conjunction with b and τ tagging scores to more fully classify the
jet's flavor. We do not expect quark/gluon tagging to reach the same power as b-tagging or
τ-tagging, but our quark-efficiency vs gluon-rejection curve can serve as a first approximation
of what is achievable.
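
As an illustration of how a likelihood score and a prior quark/gluon fraction could be combined into a probability, the following minimal sketch applies Bayes' rule. The function name and arguments are hypothetical and are not part of the tagger described in this paper; the likelihoods are assumed to be the quark and gluon probability densities evaluated at the jet's observed variables.

```python
def quark_probability(likelihood_quark, likelihood_gluon, prior_quark_fraction):
    """Posterior probability that a jet is quark-initiated (Bayes' rule).

    likelihood_quark / likelihood_gluon: density of the observed jet variables
    under the quark and gluon hypotheses (hypothetical inputs).
    prior_quark_fraction: expected fraction of quark jets in the sample.
    """
    if not 0.0 <= prior_quark_fraction <= 1.0:
        raise ValueError("prior_quark_fraction must lie in [0, 1]")
    weighted_quark = prior_quark_fraction * likelihood_quark
    weighted_gluon = (1.0 - prior_quark_fraction) * likelihood_gluon
    total = weighted_quark + weighted_gluon
    if total <= 0.0:
        raise ValueError("at least one weighted likelihood must be positive")
    return weighted_quark / total


# A jet that looks twice as quark-like as gluon-like, in a sample that is
# 30% quark jets a priori, is still more likely to be a gluon jet.
print(quark_probability(2.0, 1.0, 0.30))
```

The same score therefore corresponds to different probabilities in samples with different flavor compositions, which is why the prior fraction enters explicitly.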

The main result of this paper is that a small set of 1-3 input observables captures nearly
all of the quark/gluon differences. The most useful variables could become the focus of
theoretical study, experimental measurement, and Monte Carlo validation. The multidimensional
distributions of quark/gluon discriminating variables can be experimentally verified, for
example, by looking at many samples with different known quark/gluon compositions, especially
ones that are relatively pure [3].

Since jet properties depend strongly on p_T, we examine jets in narrow p_T windows
around six central values between 50 GeV and 1.6 TeV, spaced by factors of 2. As a result of our
examination of so many observables, we can make general statements about some p_T trends.
For example, track counts are more useful at high p_T, whereas geometric moments (which
measure the width/girth of jets) are more useful at low p_T. In addition, some observables are
more powerful discriminants when the operating point of the tagger is chosen at high quark
efficiency, and others are useful when a stronger cut is used to achieve high quark purity.
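
The notion of an operating point can be made concrete with a short sketch: choose a cut on a one-dimensional score that keeps a target fraction of quark jets, then count how many gluon jets fail it. The function name, the convention that higher scores are more quark-like, and the toy scores are assumptions made for illustration only.

```python
def gluon_rejection_at_quark_efficiency(quark_scores, gluon_scores, target_efficiency):
    """Return (cut, quark_efficiency, gluon_rejection) for a score cut.

    Assumes higher scores are more quark-like. The cut is placed so that
    roughly `target_efficiency` of the quark jets satisfy score >= cut.
    """
    ranked = sorted(quark_scores, reverse=True)
    n_keep = max(1, round(target_efficiency * len(ranked)))
    cut = ranked[n_keep - 1]
    quark_eff = sum(s >= cut for s in quark_scores) / len(quark_scores)
    gluon_eff = sum(s >= cut for s in gluon_scores) / len(gluon_scores)
    return cut, quark_eff, 1.0 - gluon_eff


# Toy scores, invented for illustration.
quark_scores = [0.9, 0.8, 0.7, 0.6]
gluon_scores = [0.85, 0.5, 0.4, 0.3, 0.2]
print(gluon_rejection_at_quark_efficiency(quark_scores, gluon_scores, 0.5))
```

Scanning `target_efficiency` from 0 to 1 traces out a quark-efficiency vs gluon-rejection curve for the chosen score.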

In the next section, we review past calculations and collider measurements at LEP and 
the Tevatron. After that, we define our observables, show hadron-level distributions, and 
quantify their performance. Finally we combine observables using boosted decision trees to 
form a multivariate discriminant. The final sections include comments on how one might use 
a quark/gluon tagger in situations where the signal or background contains both quark and 
gluon jets. 

2 History and Future of Quark/Gluon Measurements 

There are several differences between quarks and gluons that prove useful in motivating
observables that can distinguish quark-initiated jets from gluon-initiated jets.
Below is a list of these properties and the observables they motivate:

• Color Charge: C_F vs C_A → jet mass, girth/width, track count

• Color Connections: 1 vs 2 → eccentricity, planar flow, and pull

• Electrical Charge → charge-weighted track p_T

• Spin: 1/2 vs 1 → correlations in the location of subjets

An excellent review of theoretical and experimental results as of 2003 is presented in [4],
some of which we now summarize. LEP studied the difference between quark and gluon jets
by looking at 3-jet events. These correspond to e+e- → q q̄ g at the parton level. At high
center-of-mass energy, the two hardest jets are quark-initiated 99% of the time, thus one can
use energy to select a pure sample. In another selection method, the highest energy jet is
assumed to be a quark jet and one of the other jets is tagged for heavy flavor, which indicates
the third should be gluonic. Alternatively, for three jets of similar energy, two b-tags gave a
clean sample of ~30 GeV gluon jets.

LEP measured the ratio of the number of particles in gluon vs quark jets. The average
multiplicity of any type of particle, along with its variance, is given by the semi-classical
approximation

    \frac{\langle N_g \rangle}{\langle N_q \rangle} = \frac{C_A}{C_F}, \qquad \frac{\sigma_g^2}{\sigma_q^2} = \frac{C_A}{C_F}    (2.1)

where C_A/C_F = 9/4. The angular width of the jet, using the Sterman-Weinberg definition, is to
leading order

    \delta_g = \delta_q^{C_F/C_A} .    (2.2)

An intuitive explanation for these results is that a quark jet is dominated by the first gluon 
emission, at which point it continues to shower like a gluon jet. Since gluon jets have more 
particles, for a given energy they will have correspondingly fewer hard particles. 
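
The color-factor arithmetic behind these statements can be checked directly. The sketch below uses the standard SU(N) Casimirs, C_F = (N^2 - 1)/(2N) and C_A = N, and the leading-order width relation in which the gluon cone width is the quark cone width raised to the power C_F/C_A; the function names and the example quark cone width of 0.1 are chosen here purely for illustration.

```python
from fractions import Fraction


def casimirs(n_colors=3):
    """Return (C_F, C_A) for SU(n_colors) as exact fractions."""
    n = Fraction(n_colors)
    c_f = (n * n - 1) / (2 * n)  # fundamental representation (quarks)
    c_a = n                      # adjoint representation (gluons)
    return c_f, c_a


def gluon_cone_width(delta_quark, n_colors=3):
    """Leading-order gluon width: delta_g = delta_q ** (C_F / C_A)."""
    c_f, c_a = casimirs(n_colors)
    return delta_quark ** float(c_f / c_a)


c_f, c_a = casimirs()
print(c_f, c_a, c_a / c_f)    # 4/3 3 9/4
print(gluon_cone_width(0.1))  # larger than 0.1, since the exponent 4/9 is below 1
```

Because the cone width is smaller than 1 and the exponent is 4/9, the gluon jet comes out wider than the quark jet, in line with the intuitive picture above.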

In cases where QCD estimates do not agree with full simulation or with data, the reason is
often attributed to energy conservation not being taken into account in each splitting. Since
shower Monte Carlos enforce this energy conservation, they often have better agreement
with data than the analytic estimates. Multiplicities have been calculated, including
energy-momentum conservation, at N^3LO [5]. At LEP I energies, the result was ⟨N_g⟩/⟨N_q⟩ ~
1.7. OPAL [6] studied the charged particle multiplicity in light quark jets of average energy
45.6 GeV and gluon jets of 41.8 GeV. The moments (mean, width, skewness,
kurtosis) of the particle-count distributions were found to agree with the Monte Carlo event
generators and with analytic predictions.

Subjet multiplicities were also examined at LEP for various subjet sizes [7, 8]. Extremely
small subjets (k_T = 0.1 GeV) approach the limit of particles, and therefore probed
hadronization. But larger subjets (k_T = 5 GeV) probed the better-modeled perturbative physics and
gave the largest ratio between quark and gluon subjet multiplicities. For the first study cited,
the average energy of the quark jets was 32 GeV, while that for gluon jets was 28 GeV. Later
in this paper, we show that smaller subjets always improved quark/gluon discrimination at
the LHC, down to the smallest subjets we probed with a resolution-limited size of R_sub ~ 0.1.

The particle types identified within jets also differ between quarks and gluons. For
example, the numbers of K^0, Λ, π^±, K^±, p, η, η', and π^0 particles have been studied. LEP
found an increase in baryons (protons and Lambdas) for gluon jets and an increase in kaons for
quark jets. This is reviewed in [4], where table 12.1 lists results of many LEP experiments.
Some of the most relevant include DELPHI [9, 10] and OPAL [11, 12] measurements of
identified particle ratios. We do not consider variables based on particle type, since their use
depends strongly on how well these can be experimentally measured.

LEP studies found b-jets to be more similar to gluon jets than to light quark jets [13, 14].
The number of particles was higher in b-jets than in light quark jets, as was the angular
spread. Both of these effects are due to the longer decay chain of B-hadrons, which overwhelms
the effect of the perturbative parton shower. The peculiarity of b-jets should be lessened at the
LHC, which has higher p_T jets and more boosted B-hadrons: for higher p_T, the QCD shower
produces more particles, whereas the particle multiplicity is relatively fixed in the B-hadron
decay. When the decay products of a B-hadron are specifically removed from consideration,
the properties of b-jets will again look more quark-like.

Importantly, b-taggers rely largely on impact parameters or a secondary vertex, and so
their efficiency should be largely uncorrelated with the observables we consider, which are
constructed from the momenta of the particles. Thus, b-mistag rates are not significantly
different for quarks as compared to gluons. One can imagine a 3-dimensional tagger based
on the probability that a jet is either b, quark, or gluon initiated. Since b-tagging is very
dependent on experimental issues, we do not attempt to consider such a tagger here. We note
that gluon splittings to heavy flavor, g → b b̄, are included in our simulation of gluon jets.

Compared to LEP studies using 25 to 45 GeV jets, the LHC will typically have higher 
energy jets. Since high-energy jets are of particular interest to new physics searches, we 
consider jets with energies up to 1.6 TeV in this study. At high energy, it is helpful to use 
longitudinally boost-invariant measures like transverse momentum and rapidity as opposed 
to energy and angle. Sometimes this motivates new variables appropriate to hadron colliders
by replacing the LEP variable E by p_T and θ by r = \sqrt{\Delta y^2 + \Delta\phi^2}.
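
A minimal sketch of the boost-invariant distance r between two directions, given as (rapidity, azimuth) pairs, is shown below. The function names are illustrative; the only subtlety is that the azimuthal difference must be wrapped so that directions on either side of the ±π boundary are treated as close.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(y1, phi1, y2, phi2):
    """Distance r = sqrt(dy^2 + dphi^2) in the rapidity-azimuth plane."""
    return math.hypot(y1 - y2, delta_phi(phi1, phi2))


# Two directions that straddle phi = 0: the wrapped separation is 0.4, not ~5.9.
print(delta_r(0.3, 0.1, 0.0, 2.0 * math.pi - 0.3))
```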

Measures of the angular width of jets were used in CDF to reject gluons and purify
fully-hadronic top-quark samples [15]. This may have been the first experimental application and
proof that a separation exists in a complex hadronic collider environment. This study showed 
that quark and gluon jets can be calibrated in a "naturally pure" quark sample (semileptonic 
tops without any explicit quark/gluon tagging). 

Quark/gluon tagging should be even more useful at the LHC than it has been at LEP
or the Tevatron. Compared to CDF and D0, ATLAS and CMS have better tracking and
calorimeters, with spatial resolutions up to 10x as high. CMS's particle flow and ATLAS's
individually calibrated TopoClusters give jet substructure techniques new power (especially
if associated with the primary vertex and corrected for magnetic field bending). Also, the
LHC's proton-proton initial state, higher energy, and higher luminosity make gluon jets more
common, and more new physics signals are buried under multi-jet events. In addition, we find
that higher p_T jets of the LHC are more taggable than lower p_T jets of previous colliders. For
example, the charged track count becomes a better indicator of flavor as the jet p_T increases.

3 Theoretical Considerations 

Before cataloging and evaluating jet observables, it is worth commenting on the extent to 
which jet flavor is well-defined. We will argue that in the case of well-separated jets,
appropriate for kinematic reconstruction, each jet can be assigned an unambiguous flavor. In
other words, any situation which is problematic for quark/gluon tagging is also problematic 
for kinematic reconstruction. Thus, quark/gluon tagging is no more poorly defined than 
reconstructing a decay chain or other short-distance interpretation of an event with jets. 

Figure 3. Jets are formed by grouping together collinear radiation.



In the parton shower picture (which is in excellent agreement with data) a hard parton,
well-separated from other hard partons in the event, undergoes showering that produces
nearly collinear radiation. An example is illustrated in Figure 3. With reasonable
assumptions about hadronization, any infrared and collinear-safe jet algorithm returns jets whose
momenta correspond in some way to the initial hard partons. This parton/jet correspondence 
is implicitly assumed in all searches which use the resulting 4-vectors to reconstruct heavy 
objects like W bosons, Higgs bosons, and tops. Violating any of these assumptions erodes 
the parton/jet correspondence. 

For example, the shower products from two nearby hard partons could significantly
overlap. Depending on the jet algorithm used, the jets might merge or have strange shapes. In
such a case, the resulting jet momenta might not be useful for kinematic reconstruction and 
the jet properties (charged particle count or mass) might not be distributed in a way that 
corresponds to isolated quark or gluon jets. 

One cause of unease is a sense that NLO quantum effects invalidate the semiclassical
parton-shower picture. Many of the NLO corrections come from real emission diagrams.
At the quantum level, there is indeed interference between diagrams with the same final 
particle flavor and momenta. This is illustrated in Figure 4 where in one diagram a collinear 
gluon emission affects the properties of unambiguously quark-initiated jets, whereas the other 
diagram is a quantum mechanically indistinguishable correction where the gluons come from a 
completely unrelated additional hard parton. Looking only at the flavor and momenta of the 
final state, one might be uncomfortable claiming the configuration corresponds to two quark 
jets. However, the parton-shower-like, nearly collinear diagram has a much larger amplitude 
and therefore the uncertainty on labeling the configuration as having quark jets is small. 

Figure 4. Parton showers produce quark jets whose properties are largely determined by the emitted
gluons, as indicated in the left diagram. On the right, the same configuration is produced when a third 
hard parton, in this case a gluon, splits into two gluons with momenta equal to the showered gluons. 
Since the two amplitudes interfere, it might not make sense to describe this final state configuration as 
having two quark jets. In this case, however, the amplitude for the shower diagram is much larger than 
the hard-gluon-splitting diagram for the same final-state kinematics. In fact, as the gluons become 
more collinear with the quarks, the first amplitude is divergent. 

Up to an overall normalization, many of the NLO effects are reproduced by including
matrix element corrections merged with a parton shower. In a fully-matched sample (using
CKKW [16] or MLM [17] for example), each jet comes unambiguously from exactly one hard
parton, and the flavor of this parton is known. The matching procedures have some merging
scale, on which the final distributions depend only weakly. Thus one can make the same
conclusion about matching for quark and gluon discrimination as for almost any other
application (such as kinematic reconstruction): it gives unambiguous answers when the final state
contains clearly separated jets. In ambiguous final states, which can be explicitly avoided,
there is no well-defined underlying parton topology relevant for any analysis.

Ambiguities are always present in event reconstruction from jets: fully hadronic t t̄ decay
doesn't always produce six clean, well-separated jets with unambiguous correspondence to
b and W decay products. Thus, the problem is no worse for quark/gluon tagging than for
top reconstruction. In addition, the mixing effect is numerically quite small. It is of course
suppressed by a factor of α_s. But also, hard splittings which change quarks to gluons or vice
versa are power suppressed, for example by m_jet/E_jet. NLO ambiguities are important for
measurements like the inclusive jet cross sections, but the bottom line is that NLO effects do
not prevent a quark/gluon tagger from being a practical tool for many new physics searches.

For additional justification, we point out that b-tagging is at least as ambiguous as
quark/gluon tagging, and has been well-proven to be useful. For b-tagging, there are many
simple, leading-order b-jet definitions. For example, b-jets can be defined as jets that contain
a B-hadron among their decay products. This does not mean that a perfectly accurately
b-tagged jet has a momentum that corresponds to the initiating b quark. For example, an event
with a Z decaying to b and b̄ quarks doesn't necessarily produce a pair of b-tagged jets that
have an invariant mass that corresponds to the Z. Away from the mass peak, either the
wrong jets contained the B-hadron or there is simply no single jet that corresponds to each
b quark. At the Monte Carlo level, the hard b-quarks won't correspond to b-tagged jets any
better than a W boson's direct quark decay products will correspond to two jets that should
obviously be labeled the quark jets.

An alternative to using the event record in a Monte-Carlo to extract truth information 
about the hard parton initiating a jet would be to cluster the partons in the jet such that 
flavor information is retained. An algorithm for doing so was proposed by Banfi, Salam, and 
Zanderighi in [18]. Their idea was to count the number of quarks minus anti-quarks in a jet. 
By itself, this would not be a good infrared and collinear-safe definition. But they modified 
the k_T jet-clustering algorithm to combine only partons that preserve flavor in an infrared and
collinear-safe way. Gluons can be combined with u-quarks to make a u-jet, u and ū can be
combined into a flavorless gluon-jet, but u and d-quarks cannot be combined. The focus of [18]
was on precision calculations in perturbative QCD with a small number of partons involved. 
The applications of quark/gluon tagging at hadron colliders are somewhat different. Since 
the observables at colliders are tracks and calorimeter deposits from color-neutral hadrons, 
a parton-level jet-flavor algorithm like the one in [18] is not directly applicable. The exact 
quark minus anti-quark count is not reliably observable, nor does it directly capture the useful 
but vague notion that a particular jet 'was initiated by' a particular quark or gluon. The 
algorithm in [18] could be used on the pre-hadronization partons in a Monte Carlo event 
record to assign a truth-flavor in a non-matched sample. But if the relevant hard partons are 
available in the event record, one might as well use them for the truth information, since this 
corresponds exactly to what one is trying to extract from the event. 
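
To make the flavor-counting idea concrete, the sketch below tallies quarks minus antiquarks per flavor for the partons in a jet. It illustrates only the naive counting step, which, as noted above, is not infrared and collinear safe by itself; it is not the modified clustering algorithm of [18]. Partons are identified by standard PDG Monte Carlo codes (1 to 6 for d, u, s, c, b, t, negative for antiquarks, 21 for the gluon), and the function names and the three-way label are assumptions made for this example.

```python
from collections import Counter


def net_flavor(pdg_ids):
    """Net quark-minus-antiquark count per flavor, dropping flavors that cancel."""
    net = Counter()
    for pid in pdg_ids:
        if 1 <= abs(pid) <= 6:  # quark or antiquark; gluons (21) are skipped
            net[abs(pid)] += 1 if pid > 0 else -1
    return {flavor: count for flavor, count in net.items() if count != 0}


def naive_flavor_label(pdg_ids):
    """Label a set of partons as 'gluon', 'quark', or 'ambiguous'."""
    net = net_flavor(pdg_ids)
    if not net:
        return "gluon"  # all flavor cancels, e.g. u + ubar + gluons
    if len(net) == 1 and abs(next(iter(net.values()))) == 1:
        return "quark"  # exactly one net quark or antiquark
    return "ambiguous"  # e.g. u + d, which cannot be combined


print(naive_flavor_label([2, 21, 21]))  # quark
print(naive_flavor_label([2, -2, 21]))  # gluon
print(naive_flavor_label([2, 1]))       # ambiguous
```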

To verify the distributions of the variables discussed below, samples of known flavor
composition can be used. For example, in a γjj event when the softer jet is near enough to
the photon, it is over 98% likely to be a quark jet. (This can be understood from the simple
observation that quarks radiate photons but gluons do not.) A catalog of high cross-section 
processes and kinematic cuts which can be used to purify samples was given in [3]. The 
fraction of a cross section consisting of quark or gluon jets is in fact well-defined beyond the 
leading order in perturbation theory as long as the jets are hard and well-separated. This 
follows essentially because helicity is conserved in the collinear limit, as discussed in detail in 
[3]. 

Jets from the pure samples discussed above might not be representative of jets in a signal 
or background of interest. To predict the jet properties of a new signal, simulations must 
be employed at some level, and there are problems assigning truth-level flavor to these jets. 
One popular method is to 'match' the jets to the hard process using their ΔR. While this is
common, it doesn't take into account how well the energies match. It is also not guaranteed, 
for example, that the 4-momenta of hard partons from MadGraph are preserved when Pythia 
adds initial state radiation and has to rebalance the event. This procedure can only be trusted 
in a matched sample, where the hardest jets have explicit matrix-level counterparts. Another 
method is to examine the shower history, which can be used to assign a jet a truth tag if a
large enough fraction of its energy is 'descended' from a single hard parton. Changing the 
definition of 'large enough' might alter the distribution of the jet properties of interest. But 
a larger problem is that soft radiation is really a dipole effect, sourced by two hard partons, 
whereas Pythia randomly assigns this soft radiation to either one or the other parent. 
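
The ΔR matching described above can be sketched as follows. Jets and partons are represented as dictionaries with hypothetical keys `y`, `phi`, and `flavor`, and the matching radius of 0.3 is an arbitrary choice for illustration, not a value taken from this study. As the text cautions, this purely angular criterion ignores how well the energies agree.

```python
import math


def delta_r(a, b):
    """Distance in the rapidity-azimuth plane, with the azimuth wrapped."""
    dphi = (a["phi"] - b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return math.hypot(a["y"] - b["y"], dphi)


def match_jets_to_partons(jets, partons, max_delta_r=0.3):
    """Assign each jet the flavor of its nearest hard parton, or None if too far."""
    labels = []
    for jet in jets:
        nearest = min(partons, key=lambda p: delta_r(jet, p), default=None)
        if nearest is not None and delta_r(jet, nearest) < max_delta_r:
            labels.append(nearest["flavor"])
        else:
            labels.append(None)  # no hard parton within the matching radius
    return labels


jets = [{"y": 0.0, "phi": 0.1}, {"y": 1.0, "phi": 3.0}, {"y": -2.0, "phi": 1.0}]
partons = [
    {"y": 0.05, "phi": 0.0, "flavor": "quark"},
    {"y": 1.1, "phi": -3.2, "flavor": "gluon"},
]
print(match_jets_to_partons(jets, partons))  # ['quark', 'gluon', None]
```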

4 Event Generation 

Most of the results in this paper pertain to events generated with MadGraph v4.4.26 [19] and
showered through Pythia v8.140 [20] with the most recent default tune. We also compare to
the same events showered with Herwig++ 2.5.2 [21]. Jets are reconstructed using FastJet
v2.4.2 [22]. The multivariate analysis is done using the TMVA v4.0.4 package [23] that comes
with ROOT v5.27.02 [24].

No detector simulation was done. Instead, we discard charged particles with momenta 
less than 500 MeV. These particles are not allowed to contribute to either the construction 
of jets or the observables involving charged tracks. This 500 MeV cut is identical to early 
ATLAS studies [25] (later studies have raised the cutoff to 1 GeV [26]). With data and better
tunes, a full detector simulation (not publicly available) will become necessary to validate the 
various variables. 

Since experiments also compare to Monte Carlo truth-hadrons, our study provides a useful 
rendezvous point. The goal of this paper is to point out potentially interesting observables, 
some new, which might either be used right away or studied in greater detail. To that end, we 
have made an effort to find observables that depend more on the perturbative parton shower 
than on hadronization. No effort has been made to explicitly consider multiple interactions, 
though they are included in the underlying event model. Pileup, however, is not explicitly 
included since removing it is best studied with a full detector simulation. 

We start from a dijet sample pp → jj with the jets in each sample having their p_T in
windows centered around values spaced by factors of 2 in GeV: (50, 100, 200, 400, 800, 1600).
We also considered a back-to-back γ + jet sample for the same jet p_T values, and the results were
nearly identical.

The shape of the quickly falling p_T distribution within each window affects the efficiencies
of various variables. For example, the p_T of a jet depends on the jet algorithm and jet size; this
dependence is precisely one of the variables studied here that usefully distinguishes quarks
from gluons. That the cross sections fall sharply with p_T makes the initial sample selections
quite delicate. Ideally we would simulate a 'natural' dijet p_T distribution and select jets only
within an infinitesimal window around each p_T. With a narrow enough window, the falling
distribution can have a negligible effect. But the shower, hadronization, and jet algorithms
must all be run before this determination can be made, so an extremely narrow window is
computationally inefficient.

To deal with the rapidly falling distributions, we chose parton-level MadGraph cuts to
reproduce samples with 'natural' anti-k_T R = 0.5 jet p_T distributions within a ±10% window,
starting with the most narrow parton p_T window that was possible. Even when quark and
gluon partons start with an identical initial p_T, after showering, quark jets had a higher
average p_T than gluon jets. This was compensated by shifting and widening the initial
parton-level p_T windows to efficiently generate a representative distribution of jet p_T values within
the narrower jet window. The gluon and quark parton p_T windows were chosen to have
a ±20% width and were shifted relative to each other to align the centers of their anti-k_T
R = 0.5 jet p_T distributions on the nominal values. The resulting p_T distributions still aren't
exactly identical: the gluon distribution has more of a lower tail, and the quark distribution has
an upper tail. So only jets within ±10% of the nominal p_T were kept. Additionally, for dijets, the
p_T distribution of the gluons falls faster than that for quarks (as can be inferred from the
changing fractions in Figure 2), but for our narrow final jet p_T window, this slope difference
is negligible. Our shifting and spreading successfully decoupled jet p_T from the jet properties
while maintaining somewhat efficient event generation. This prevents our tagger from picking
up on the difference in p_T distribution of the input samples rather than the jet properties.
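
The final jet-level selection can be summarized in a few lines. The sketch below builds the six nominal p_T values and keeps only jets within ±10% of a chosen nominal value; the names are illustrative, and the parton-level shifting and widening of the generation windows is not modeled here.

```python
# Nominal jet p_T values in GeV, spaced by factors of 2: 50, 100, ..., 1600.
NOMINAL_PT_GEV = [50 * 2 ** k for k in range(6)]


def pt_window(nominal_pt, half_width=0.10):
    """Lower and upper p_T edges of the window around a nominal value."""
    return (1.0 - half_width) * nominal_pt, (1.0 + half_width) * nominal_pt


def select_jets_in_window(jet_pts, nominal_pt, half_width=0.10):
    """Keep only jet p_T values inside the window around nominal_pt."""
    low, high = pt_window(nominal_pt, half_width)
    return [pt for pt in jet_pts if low <= pt <= high]


print(NOMINAL_PT_GEV)
print(select_jets_in_window([85.0, 95.0, 100.0, 109.0, 115.0], 100.0))
```

Applying the same window to the quark and gluon samples is what prevents a classifier from learning the p_T difference between them instead of the jet properties.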

The pre-shift and post-window procedure above slightly biases the sample for a finite
width. Any 'real' set of jets at a particular p_T will include some whose underlying parton was
much softer, and others where it was much harder. A 'natural' p_T distribution, especially for a
QCD background, is exponentially falling, which means that jets at a particular p_T will more
likely come from softer partons that showered-up rather than harder partons that
showered-down. With these caveats, the best advice is to take our scores as a rough guide, focus on
ones that don't change drastically with jet p_T, and train any multivariate discriminant either
bin-by-bin in p_T, or on the actual underlying p_T distribution of your signal and background
samples. To construct a general gluon tagger, the experiments will need a canonical jet
definition so they can train it on a set of quark and gluon jets with identical p_T distributions
with respect to that jet definition. Some anti-k_T R=0.5 p_T distributions within our windows
are shown in Figure 5.

5 Overview of Observables 

For the purposes of quark/gluon tagging, a jet can be thought of as a set of particles, tracks, or
calorimeter deposits. Each constituent has a 4-momentum and possibly a charge or particle 
ID, though this is difficult to determine. Given this huge set of constituent data, the goal is 
to estimate the likelihood that the jet was initiated from a quark rather than a gluon. 

If a jet is made of 100 constituents, each with a massless 4-momentum, the problem 
is 300-dimensional. Ideally we'd have a fully differential cross section: a 300-dimensional 
probability density for quark jets, and one for gluon jets. This would take into account 
important correlations when reducing the list of particles to a one dimensional quark/gluon 
likelihood, but of course this is completely unrealistic. Familiar multivariate classifiers like 
Neural Networks or Boosted Decision Trees are designed to estimate this likelihood if properly 
trained. Unfortunately, they aren't designed to deal with a long, variable-length list of inputs. 
In other words, we can't just give them the 4-momenta for each particle in the jet and hope
for the best. The challenge is to find simple observables that allow us to get as close to this
ideal likelihood as possible. Since the particles are not independent, a good observable must
extract the most important correlations.

Figure 5. The p_T distributions for two quark and gluon jet samples with arbitrary normalization.
Our samples at each p_T were chosen such that anti-k_T R=0.5 jets had p_T values within 10% of the
nominal value. For all QCD jets, this distribution is falling, but within our window, the p_T itself
cannot be used to distinguish quark from gluon jets. On the left is the 50 GeV sample, and on the right
is the 800 GeV sample.

As in jet algorithms, we'd like to do this in a way that is infrared and collinear-safe. This 
means that a jet's score shouldn't change if one particle in the jet is replaced by two where 
either (1) both are traveling in the same direction as the original or (2) one has very soft 
momentum. Raw particle count is not infrared safe, nor are things that depend directly on 
particle count like average particle pT. Charged particle count with a minimum pT is safer 
with respect to soft emission and also safer with respect to collinear emissions, since these 
must conserve charge. 
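
The safety criterion above can be checked numerically on a toy jet. The following is a minimal sketch, assuming a jet is represented as a list of hypothetical (pT, r) pairs, where r is the distance from the jet axis; the helper names are illustrative and not taken from any analysis code. It compares raw multiplicity with an observable that is linear in pT under a collinear split and under a very soft emission.

```python
def multiplicity(constituents):
    """Raw particle count: changes under any splitting, so not IRC safe."""
    return len(constituents)

def pt_weighted_radius(constituents):
    """pT-weighted mean distance from the jet axis (linear in pT)."""
    total_pt = sum(pt for pt, _ in constituents)
    return sum(pt * r for pt, r in constituents) / total_pt

def split_collinear(constituents, index, fraction):
    """Replace one constituent by two at the same r that share its pT."""
    pt, r = constituents[index]
    rest = constituents[:index] + constituents[index + 1:]
    return rest + [(fraction * pt, r), ((1.0 - fraction) * pt, r)]

def add_soft(constituents, soft_pt, r):
    """Add one extra constituent with very small pT at distance r."""
    return constituents + [(soft_pt, r)]

# Toy jet (assumed values): (pT in GeV, distance r from the jet axis).
jet = [(60.0, 0.02), (25.0, 0.10), (15.0, 0.30)]
collinear = split_collinear(jet, 0, 0.3)
soft = add_soft(jet, 1e-9, 0.4)

print("count: ", multiplicity(jet), multiplicity(collinear), multiplicity(soft))
print("moment:", pt_weighted_radius(jet), pt_weighted_radius(collinear),
      pt_weighted_radius(soft))
```

The count jumps by one in both cases, while the pT-weighted moment is unchanged by the collinear split and changes only negligibly for the soft emission.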

Infrared and collinear safety is usually framed as a strict yes/no requirement in the 
limit of exactly collinear splitting or zero momentum soft emission. In reality, by the time 
individual tracks or calorimeter deposits are observed, they would never be exactly collinear 
nor infinitely soft. Thus it is more meaningful to envision a spectrum of safety. For example, 
while all of the popular iterative jet algorithms are infrared safe, counting the number of small 
anti-kT subjets of size R=0.1 is less safe than counting larger R=0.3 subjets. Unfortunately, 
the smaller subjets turn out to be more useful, and the charged particles even more so. 

Thus we consider two main types of observables: ones that try to distinguish individual 
particles, tracks, or subjets, and ones that treat the energy or pT within the jet as a function 
of (δy, δφ) away from the jet axis. 

The first category includes things like count, average pT, and spread (standard deviation) 
in pT for these discrete objects. Subjets can be obtained by explicit kT declustering into N 
jets, or by running a jet algorithm again with a much smaller R. These have been studied 
and confirmed extensively at LEP, and provide better discrimination at high quark efficiency 
and high pT, but can be more difficult to measure at hadron colliders in crowded jets. Here 
CMS particle flow or ATLAS TopoClusters can extract the most information out of each jet. 
The identities of particles were not explored by us, but as mentioned above, LEP found an 
increase in baryons (protons and Lambdas) for gluon jets and an increase in kaons for quark 
jets. 

The discrete-category observables that we evaluated are listed below. In later sections, 
the useful ones are described in more detail, distributions are shown, and gluon rejection 
scores are compared for different parameters (like jet size). 

Distinguishable Objects (particles/tracks/subjets) (Section 7): 

• Particle/track/subjet multiplicity, with different subjet algorithms and sizes R_sub 

• ⟨pT⟩: Average pT of particles/tracks/subjets within the jet 

• Higher statistical moments like ⟨pT²⟩ and σ_pT 

• Average distance from jet axis ⟨r⟩ and higher moments like ⟨r²⟩ 

• ⟨kT⟩: Average kT, the momenta transverse to the jet axis 

• The pT fraction or ΔR of subjets, explicitly declustered into N jets 

• Subjet count above a particular pT or a fraction of jet pT 

• Subjet splitting scale 

• Masses of subjets 

• Charge-weighted pT sum of tracks 

The second, more continuous, category includes jet mass, jet broadening, and the family 
of radial moments. These tend to be better for lower pT jets and better at achieving high 
purity at the cost of low quark efficiency. Some observables we evaluated are listed below 
with more detail left to later sections. 

Continuous Shapes (Section 8): 

• Jet Mass and m/pT ratio (Section 8.1) 

• Jet Shapes: integrated and differential, to some distance from the jet axis (Section 8.2) 

• Radial moments (Section 8.3) 

• "Girth" of each jet (Section 8.4) 

• Jet Broadening (Section 8.4) 

• Jet Angularities (Section 8.5) 

• Optimal Radial Moment (Section 8.6) 

• N-Subjettiness (Section 8.7) 

• Two-Point Moment (Section 8.8) 

• Higher geometric moments like eccentricity or planar flow (Section 8.9) 

• Pull as a measure of color-connections (Section 8.10) 

When we describe these variables in more detail, it will be useful to have a way to score 
or rank them. In the next section, we describe our scoring methods. 

6 Evaluation of Discrimination Power 

In this section we describe ways of quantifying our jet observables' quark/gluon tagging power. 
We will then use gluon rejection at a fixed quark acceptance to rank our variables to find the 
most powerful discriminants. We will also look at how the discrimination power depends on 
things like jet pT or jet size. 

One method of evaluating an observable's discriminating power is the separation [23] 

$$ \langle S^2 \rangle = \frac{1}{2} \int \frac{\left( p_s(x) - p_b(x) \right)^2}{p_s(x) + p_b(x)} \, dx , $$

where p_s(x) and p_b(x) are the signal and background probability density functions (PDFs) 
of some observable x. Identical distributions give zero separation, while ones with no overlap 
give a separation of one. This definition is invariant under any one-to-one change of variables. 
This invariance fixes the exponent on the PDFs used in the numerator and denominator. 
This measure does not directly tell us how the observable will perform as a tagger, so we do 
not use it. 
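
As an illustration of the two limiting cases quoted above, here is a minimal sketch of a binned separation. It assumes the separation takes the form one half of the integral of (p_s - p_b)^2 / (p_s + p_b), and that both histograms share the same binning; the function name and the toy counts are hypothetical. With normalized bin contents the bin widths cancel, so no widths are needed.

```python
def separation(signal_counts, background_counts):
    """Binned separation: 0.5 * sum over bins of (ps - pb)^2 / (ps + pb).

    ps and pb are the normalized bin contents of two histograms with
    identical binning. Bins that are empty in both samples are skipped.
    """
    s_total = float(sum(signal_counts))
    b_total = float(sum(background_counts))
    result = 0.0
    for s, b in zip(signal_counts, background_counts):
        ps, pb = s / s_total, b / b_total
        if ps + pb > 0.0:
            result += 0.5 * (ps - pb) ** 2 / (ps + pb)
    return result

# Toy histograms (assumed counts, not taken from the study).
print(separation([5, 3, 2], [5, 3, 2]))   # identical shapes
print(separation([4, 0], [0, 7]))         # no overlap
print(separation([1, 1], [1, 0]))         # partial overlap
```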

Other measures of discrimination power involve the so-called ROC curve¹ shown in Figure 6. 
The ROC curve is constructed by sliding a cut across the variable and plotting the 
gluon 'background' rejection against the quark 'signal' acceptance. If the distributions 
completely overlapped, the result would be the diagonal line. Height above the diagonal represents 
discrimination power. One way to quantify this power is to choose a reference signal efficiency 
(80% is shown) and measure background rejection there. Variables can be ranked by rejection 
power at this chosen signal efficiency. 
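
The sliding-cut construction can be sketched in a few lines. This is a minimal illustration under stated assumptions: the observable is one where quark jets tend to have smaller values (as with a count), so the cut keeps jets with x <= cut; the function names and toy values are hypothetical.

```python
def roc_points(quark_values, gluon_values):
    """Slide a one-sided cut 'keep x <= cut' over all observed values.

    Returns a list of (quark_efficiency, gluon_rejection) pairs,
    starting from the cut that keeps nothing.
    """
    cuts = sorted(set(quark_values) | set(gluon_values))
    n_q, n_g = len(quark_values), len(gluon_values)
    points = [(0.0, 1.0)]
    for cut in cuts:
        eff_q = sum(1 for x in quark_values if x <= cut) / n_q
        eff_g = sum(1 for x in gluon_values if x <= cut) / n_g
        points.append((eff_q, 1.0 - eff_g))
    return points

def rejection_at(points, target_efficiency):
    """Best gluon rejection among cuts keeping at least the target quark fraction."""
    return max(rej for eff, rej in points if eff >= target_efficiency)

# Toy samples (assumed values): quark jets have lower counts on average.
quarks = [2, 3, 4, 5, 6]
gluons = [5, 6, 7, 8, 9]
points = roc_points(quarks, gluons)
for target in (0.5, 0.8, 1.0):
    print(target, rejection_at(points, target))
```

For a finite sample the achievable efficiencies are discrete, so the sketch reports the best rejection among cuts that keep at least the requested quark fraction rather than interpolating.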

Variables with more complicated distributions might require a two-sided cut with the 
signal either inside or outside the cut. For two-sided cuts there is no unique background 
rejection for a given signal efficiency, so the best value is used. Even more complicated 
distributions, including multivariate ones, require more general boundaries (a contour in 
2D, a surface in 3D, etc.). Transforming the (possibly multidimensional) quark and gluon 
observable distributions into a single likelihood q/(q + g) distribution always allows for a 
single-sided cut on this likelihood. If the multidimensional distributions were known a priori, 
a sliding cut on this likelihood would form the best possible ROC curve. Any one-to-one 
map of this likelihood like log(q/g) can also be cut on to produce an identical ROC curve. A 
good multivariate discriminator (e.g. a neural net or boosted decision tree) estimates one such 
one-to-one map given limited training data. 

¹ The name comes from radar. It stands for Receiver Operating Characteristic. 

[Figure 6 panels: mass/pT distribution (left); ROC curve for mass/pT (right)] 

Figure 6. On the left is the distribution of jet mass divided by pT for quark and gluon jets of 
pT ≈ 200 GeV. In all plots that follow, quarks will always be solid blue and gluons hashed red. These 
distributions are normalized to equal area. Every cut leads to a particular efficiency for keeping quarks 
and a (hopefully lower) efficiency for keeping gluons. One minus this gluon efficiency is the background 
rejection. The curve formed by all possible cuts is called the ROC curve, and is shown on the right. 
The particular point shown corresponds to keeping 80% of the quark signal while rejecting 50% of the 
gluon background. 

Sometimes ROC curves for different variables cross. We will find, for example, that 
at high signal efficiency (a loose cut), counting the charged tracks is the best observable, 
while for low signal efficiency (a tight cut), observables like jet broadening are best. In these 
situations, there is no unique way to rank the relative discrimination power since different 
variables are superior for different signal efficiencies. The area under the ROC curve, which 
is equivalent to averaging the rejection power over all signal efficiencies, provides another 
measure of discrimination power without having to pick a reference signal efficiency. Ranking 
the variables by area turns out to be very similar to ranking them by their background 
rejection at around 50% signal efficiency, where the distance to the diagonal is greatest. 
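
The equivalence between the area under the ROC curve and the rejection averaged over signal efficiency can be made concrete with a trapezoidal sum. This is a minimal sketch, assuming the curve is given as hypothetical (quark_efficiency, gluon_rejection) pairs that include the endpoints; it is an illustration, not the procedure used for the rankings in this study.

```python
def area_under_roc(points):
    """Trapezoidal area under rejection versus signal efficiency.

    Points are (efficiency, rejection) pairs. They are ordered by rising
    efficiency and, at equal efficiency, by falling rejection.
    """
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area

# Limiting cases: fully overlapping distributions and perfect separation.
diagonal = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
perfect = [(0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(area_under_roc(diagonal), area_under_roc(perfect))
```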

Any measure of the discrimination power of a variable is going to depend on the pT of 
the jet, along with other parameters like the jet algorithm or jet size. It will also depend 
on the source (Pythia, Herwig, data), although robust and well simulated variables should 
minimize this dependence. For these reasons, we'll do our best to display and summarize our 
findings, but the final recommendation will always be subject to validation on real data. 

Figure 7. Charged track count for anti-kT R=0.5 jets, normalized to equal area. Left: 100 GeV jets 
with quark jets in solid blue and gluon jets in hashed red. Right: All pT samples where solid is quark 
and dotted is gluon. For a variable that is less sensitive to the jet pT, the count could be divided by 
the log of the jet pT. 

7 Discrete (Particle/Track/Subjet) Variable Results 

For discrete variables our results can be summarized simply as "smaller is better". Counting 
all particles gives the best discrimination power, although since neutral particles often cannot 
be distinguished, especially in a high-pileup environment, using all particles may not be 
practical. If all particles are not available, the next best option is counting charged tracks. 
We define charged tracks to mean all charged particles in the jet above 0.5 GeV. Distributions 
of charged track count are shown in Figure 7, and the gluon rejection for all pT samples, jet 
sizes, and three different quark efficiencies is shown in Figure 8. 

After charged particle counts, the smallest subjets do best. Distributions for anti-kT 
R=0.1 subjets are shown in Figure 9. The rejection power of subjets of many types and 
sizes is shown in Figure 10. The smallest have R_sub = 0.1, the approximate resolution of 
distinguishable TopoClusters. 

Finding the average particle/track/subjet pT and normalizing to the jet pT gives no more 
information than the count. Finally, the standard deviation of subjet pTs (also normalized 
by the jet pT) is useful, but not more than the counts, and also not as useful when combined 
with other observables. It is not shown here. 

Figure 8. For a single variable, in this case charged track count, each panel is the gluon rejection for a 
different quark acceptance. A mild 80% cut is shown on the left and the harshest 20% cut is on the right. 
Each plot shows the gluon rejection percentage (vertical axis) as a function of jet size (horizontal axis). 
The different lines in each plot correspond to different jet pTs, with the red (bottom) being 50 GeV 
and going by factors of two until the purple (top) at 1600 GeV. The vertical scale is different for 
each plot, but higher rejection is always better. Jet sizes between R = 0.5 and R = 0.7 achieve the 
best rejection. Similar to all count-type variables, higher pT jets can achieve better gluon rejection 
because the shower has more 'time' to establish the different particle counts. 




Figure 9. Subjet count for anti-kT R=0.5 jets and subjets using anti-kT R=0.1, normalized to equal 
area. Left: 100 GeV jets. Right: All pT samples. 



[Figure 10 panels: Subjet Count; 1st Subjet's pT Fraction; Subjet pT Standard Deviation; 2nd Subjet's pT Fraction] 

Figure 10. For subjets, smaller is better. Gluon rejection at 50% quark acceptance is plotted as a 
function of initial jet size R_jet. These scores are averaged over all jet pT bins from 50 GeV to 1600 GeV. 
The color corresponds to the subjet algorithm, with anti-kT in red being slightly better than CA in 
green, which is slightly better than kT in blue. As for subjet size, the darkest color corresponds to the 
smallest and best subjet size of R_sub = 0.1. Lightest is the largest and worst subjet size of R_sub = R_jet. 
These trends hold even for subjet variables not plotted. 

8 Jet Shapes and Geometric Moments 



Unlike counting tracks or taking the pT of the hardest small subjet, the continuous category 
requires more detailed definitions. We therefore provide explanations along with the results 
in this section. We first discuss the simplest jet shape, jet mass. We next discuss what is 
traditionally called the jet shape, including its integrated and differential versions. We then 
describe some useful variables like angularity and girth which are basically radial moments 
of jet shape. Next we consider more complicated observables like N-subjettiness and the 
moments of a two-point function. Finally we describe 2D moments in the (η, φ) plane like 
planar flow and pull. 

8.1 Jet Mass 

A jet's 4-vector is obtained by adding up the 4-vectors of all of the jet constituents. As long 
as the constituents are not collinear, the resulting jet 4-vector will be massive. This jet mass 
measures how spread out the constituents of the jet are. Distributions of jet mass normalized 
to jet pT for different samples, along with their gluon rejection power, are shown in Figure 11. 
There are already data [27] and theoretical calculations [28, 29] of jet mass at the LHC. 

8.2 Traditional Jet Shape 

The Jet Shape is an example of an IRC safe observable that is commonly used in jet-property 
studies. Each jet has its own integrated jet shape Ψ(r), which measures the fraction of the 
jet's total pT that falls within r of the jet axis. This is illustrated in Figure 12 and defined 
more precisely as 

$$ \Psi(r) = \frac{\sum_{i \in \mathrm{jet},\, r_i < r} p_T^i}{\sum_{i \in \mathrm{jet}} p_T^i} . \tag{8.1} $$

[Figure 11 panels: mass/pT distributions (left); gluon rejection versus jet size (right)] 

Figure 11. Jet mass over jet pT for anti-kT jets. Left: Different jet pTs for R=0.5 jets, normalized 
to equal area. Right: Gluon rejection scores as a function of jet size for different pT samples. Red is 
50 GeV, yellow is 100 GeV, and so on, doubling through the spectrum until purple at 1600 GeV. The 
best gluon rejection occurs between R=0.3 and R=0.5. 

Figure 12. The Integrated Jet Shape Ψ(r) is the fraction of the pT of a jet of cone size R falling 
within a smaller cone of size r, as illustrated in the far left panel. Ψ(r = R_jet) = 1 by definition. 
In the center is a plot of the integrated jet shape averaged over all observed jets of a particular type 
(here our quark and gluon dijet sample). On the right the distribution of r = 0.1 jet shapes is shown. 
The mean of these distributions gives the averaged Ψ(0.1) for quarks and gluons. The distribution is clearly not a 
simple Gaussian centered around the average value, indicating that much information is discarded in 
considering only the averaged integrated jet shape. The rise at low Ψ is due to jets where the parton underwent 
a semi-hard splitting leading to little pT deposited along the jet axis. 

An important distinction must be made between this definition, which is a different function 
for each jet, and what is commonly plotted as 'jet shape,' which is averaged over all jets seen 
by a detector (with some cuts). In Figure 12, this averaged integrated jet shape is the center 
plot, whereas the distribution of integrated jet shapes out to a single radius of r = 0.1 is 
the right plot. The distribution is clearly not a Gaussian centered around the average value. 
Given Ψ(0.1) for a particular jet that you want to classify, it's more useful to know the full 
distribution for quarks and gluons than just the two average values. Historic measurements 
and calculations are for the average rather than the full distribution. The same is true for jet 
masses: often average masses are calculated and measured for different pTs rather than mass 
distributions. 
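
The distinction between the per-jet shape and the sample-averaged shape can be illustrated with a short sketch. It assumes each jet is a list of hypothetical (pT, r) pairs and that the integrated shape is the fraction of summed constituent pT with r_i < r; the toy jets are invented for illustration.

```python
def integrated_jet_shape(constituents, r):
    """Fraction of a single jet's summed pT carried by constituents with r_i < r."""
    total_pt = sum(pt for pt, _ in constituents)
    inside = sum(pt for pt, r_i in constituents if r_i < r)
    return inside / total_pt

def average_jet_shape(jets, r):
    """Mean of the per-jet integrated shapes over a sample of jets."""
    return sum(integrated_jet_shape(jet, r) for jet in jets) / len(jets)

# Two toy jets (assumed values): one collimated, one with a wide semi-hard splitting.
narrow = [(60.0, 0.02), (25.0, 0.10), (15.0, 0.30)]
wide = [(10.0, 0.05), (50.0, 0.25), (40.0, 0.35)]

per_jet = [integrated_jet_shape(jet, 0.1) for jet in (narrow, wide)]
print(per_jet, average_jet_shape([narrow, wide], 0.1))
```

The two per-jet values differ strongly even though only their mean would enter an averaged jet-shape plot, which is the information loss described above.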

Measurements at CDF agreed well [30] with Pythia Tune A and Herwig out to pT = 
380 GeV. At higher pT, shapes got narrower, which is consistent with the mix of quark and 
gluon jets evolving from 27% quark at 50 GeV to 80% quark at 350 GeV. Early ATLAS 
data also agrees moderately well [25] with simulations. When used event-by-event, often a 
particular annulus was chosen to be integrated over, for example 0.2 < r < 0.7 in the CMS 
jet shape briefs [31] and [32]. At the Tevatron, CDF chose 0.3 < r < 0.7 [30]. These particular 
choices were not optimized for distinguishing quarks from gluons. 

8.3 Radial Geometric Moments 

We refer to any geometric moment that is linear in pT and independent of angle around the 
jet axis as a radial moment. Linearity in pT is required for IR/collinear safety. Specifically, 
the pT in each radial bin is weighted by a kernel f(r) and summed up to form the moment 
M_f: 

$$ \text{Radial moment using kernel } f(r): \quad M_f = \sum_{i \in \mathrm{jet}} \frac{p_T^i}{p_T^{\mathrm{jet}}} \, f(r_i) . \tag{8.2} $$

Distances r of each particle or cell from the jet center are calculated on the (rapidity, φ) 
cylinder. The jet center is taken as the (y, φ) of the jet's 4-vector, but the pT-weighted 
centroid is almost identical. It is important to use rapidity rather than pseudorapidity for the 
jet location because the jet is massive. A radial moment sums a function of these distances, 
weighted by pT, then normalized to the total pT of the jet. Energies and angles, rather than 
pTs and r's, give similar results, but are less appropriate to hadron colliders. 
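
A generic radial moment is straightforward to sketch once the distance on the (rapidity, φ) cylinder is defined. The following minimal example assumes constituents are hypothetical (pT, y, φ) triples and that the jet axis and jet pT are supplied by the caller; the function names are illustrative only. The kernel is passed as a function, so a linear kernel and a step kernel are both special cases.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def radial_moment(constituents, jet_y, jet_phi, jet_pt, kernel):
    """Sum over constituents of (pT_i / jet pT) * kernel(r_i)."""
    moment = 0.0
    for pt, y, phi in constituents:
        r = math.hypot(y - jet_y, delta_phi(phi, jet_phi))
        moment += pt / jet_pt * kernel(r)
    return moment

# Toy jet (assumed values): (pT in GeV, rapidity, phi).
jet = [(50.0, 0.1, 0.0), (50.0, -0.1, 0.0)]
linear = radial_moment(jet, 0.0, 0.0, 100.0, lambda r: r)
step = radial_moment(jet, 0.0, 0.0, 100.0, lambda r: 1.0 if r < 0.2 else 0.0)
print(linear, step)
```

Wrapping the azimuthal difference matters for jets near φ = ±π, where a naive subtraction would give distances close to 2π.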

The integrated jet shape Ψ(0.1) corresponds to the moment where f(r) is 1 out to r = 0.1 
and 0 beyond. The differential jet shape ψ(0.3) corresponds to a kernel that is 1 in a small 
window around r = 0.3. One series of kernels is the powers of r: r, r², r³, ... . These most 
closely correspond to the traditional geometric notion of 'moments.' Radial moments like 
these are interesting because it may be possible to calculate them accurately in QCD, see for 
example [33]. 

An orthonormal set of kernel functions fully characterizes the radial distribution of pT for 
a single jet, but even knowing the 1D distributions for an infinite set of orthogonal functions 
would not give complete information about the underlying high-dimensional distribution with 
all correlations preserved. In other words, knowing this series for a particular jet would allow 
a full reconstruction of where the pT in that jet goes, but the same isn't true for the 1D 
distributions. 

8.4 Linear Radial Geometric Moment: Girth, Width, or Jet Broadening 

The linear radial moment, or girth, is a special case of a generic radial moment with f(r) = r. 
For discrete constituents, it is defined as 

$$ \text{Girth}: \quad g = \sum_{i \in \mathrm{jet}} \frac{p_T^i}{p_T^{\mathrm{jet}}} \, r_i . \tag{8.3} $$

The girth distribution is shown in Figure 13. 

ATLAS calls this variable width. This is a hadron-collider version of a popular LEP 
variable called jet broadening. Jet broadening, as measured at ALEPH [7] and OPAL [8], 
leads to distributions very similar to the linear moment, simply because the small-angle 
approximation kT ≈ pT r is valid. At LEP, jet broadening was given by 

$$ B_{\mathrm{jet}} = \frac{\sum_i |\vec{p}_i \times \hat{n}_{\mathrm{jet}}|}{\sum_i |\vec{p}_i|} = \frac{\sum_i |k_T^i|}{\sum_i |\vec{p}_i|} . \tag{8.4} $$

Figure 13. Girth (also called width, or the linear radial moment) for anti-kT jets. Left: Different jet 
pTs for R=0.5 jets, normalized to equal area. Right: Gluon rejection scores as a function of jet size 
for different pT samples. Red is 50 GeV, yellow is 100 GeV, and so on, doubling through the spectrum 
until purple at 1600 GeV. For the lower pT samples, the best gluon rejection occurs around R=0.5. 



We examined higher-power radial moments, and found them less useful for quark/gluon 
discrimination. CMS [35] has examined the second radial moment. For small, narrow, nearly 
transverse jets, the second moment is equivalent to the jet mass. Higher-powered moments 
have the disadvantage of being most sensitive to the edges of the jet, where the most 
contamination lies. 

On the other hand, we have found that a very good discriminator uses a lower power, 
specifically the square root of the distance: 

$$ \sum_{i \in \mathrm{jet}} \frac{p_T^i}{p_T^{\mathrm{jet}}} \, \sqrt{r_i} . \tag{8.5} $$

8.5 Jet Angularities 

Jet Angularities are also radial moments, but their "radial distances" are rescaled into the 
angular coordinates appropriate for e+e− event shapes. They are defined by [36] as 

$$ \text{Jet Angularities}: \quad A_a = \sum_{i \in \mathrm{jet}} E_i \, f_a(\theta_i) , \tag{8.6} $$

with 

$$ f_a(\theta) = \sin^a\theta \, (1 - \cos\theta)^{1-a} \quad \text{and} \quad \theta = \frac{\pi}{2} \frac{r}{R} \tag{8.7} $$

for a < 2. The kernel function f_a(θ) is inspired by full event-shape angularities [37], but 
squished so that the edge of a jet at |r| = R is mapped to π/2. Profiles for different choices of 
the a parameter are shown in Figure 14. Note that the energies E_i are used in the definition, 
instead of the pTs popular with hadron colliders. Also, angularities are often normalized by 
the jet mass, but this is not the most useful for our purposes. A given angularity has two 
parameters (R_jet and a) in addition to any discrete choices like normalization (none, jet mass, 
jet pT, jet E) or angle used (θ as defined, or geometric θ). Gluon rejections for different 
choices of a are shown in Figure 15. 

Figure 14. Profiles f_a(θ) for different choices of the angularity a parameter spaced at 0.1 intervals 
(in rainbow) and linear radial-moment "girth" (in black). These profile shapes have nothing to do with 
the shapes of the distributions resulting from integrating these moments over jets and histogramming 
the results. 

[Figure 15 panels: 80% quark; 50% quark; 20% quark] 

Figure 15. Gluon rejection power for angularities as a function of angularity parameter a. Each line 
represents growing jet size from R=0.2 in red up to R=1.0 in purple. Here the scores for all pTs are 
averaged. The best angularities perform slightly better than masses, but worse than track and subjet 
counts. 
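
The rescaled angularity kernel can be sketched as follows. This is a minimal illustration that assumes the kernel f_a(θ) = sin^a θ (1 − cos θ)^(1−a) with θ = (π/2)(r/R), an unnormalized energy-weighted sum, and constituents given as hypothetical (energy, r) pairs; none of the names come from an existing library.

```python
import math

def angularity_kernel(r, jet_radius, a):
    """f_a(theta) with theta = (pi/2) * r / R, valid for a < 2."""
    theta = 0.5 * math.pi * r / jet_radius
    if theta == 0.0:
        # For a < 2 the kernel behaves like theta^(2-a) near the axis, so it vanishes.
        return 0.0
    return math.sin(theta) ** a * (1.0 - math.cos(theta)) ** (1.0 - a)

def angularity(constituents, jet_radius, a):
    """Unnormalized sum of E_i * f_a(theta_i) over (energy, r) pairs."""
    return sum(e * angularity_kernel(r, jet_radius, a) for e, r in constituents)

# Kernel values at the axis, halfway out, and at the jet edge for a few a values.
for a in (-1.0, 0.0, 1.0):
    print(a, [angularity_kernel(r, 0.5, a) for r in (0.0, 0.25, 0.5)])
```

Every kernel reaches 1 at the jet edge, where θ = π/2, and more negative a suppresses the region near the axis more strongly.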

8.6 Optimal Kernel for Radial Moment 

Rather than sticking to powers of r, sines and cosines (like angularity), or another orthonormal 
basis, we looked for the kernel f(r) that gives the best discrimination power between quarks 
and gluons for each pT. Because the goal is to find the best function, the optimization 
problem is technically infinite dimensional. But through reasonable smoothness criteria, it 
can be reduced to adjusting a few control-points of a spline or coefficients of an orthonormal 
basis. Since adding a constant doesn't change the discrimination power, we chose our kernels 
to have f(0) = 0. This means that the energy deposits near the crowded and noisy jet 
center count least. Multiplying by a constant (even a negative one) also does not affect 
discrimination power, so we normalized our trial profiles so their maximum value was +1. 
We evaluated the ROC curve at three different quark efficiencies: 20%, 50%, and 80%. 

Figure 16. Profiles for the optimal kernels found for various jet sizes. Kernels for the higher-pT jets 
give a higher weight to pT near the jet axis. 

The best kernels we found had rejection scores that were not significantly higher than 
those for girth (equation 8.3) or the square-root profile (equation 8.5). For this reason, we 
won't go into much more detail. Some general trends did appear. By construction, all kernels 
started out at zero at the center of the jet and rose to +1 at some distance away. In the best 
kernels, this happened around r = 0.4 for low-pT jets, 0.3 for 100 GeV jets and 0.24 for 400 
and 800 GeV jets. Beyond this, it mattered less what happened, but the best kernels did fall 
toward the edge of the jet. Examples of such kernels are shown in Figure 16. 

8.7 N-subjettiness 

N-subjettiness [38] is a family of jet shapes that attempt to characterize the degree to which a 
jet has exactly N subjets. N is one of the input parameters, and is commonly taken to be 1, 
2 or 3. N-subjettiness finds exactly N axes within the jet and associates each particle or pT 
deposit to the nearest axis. These are the N subjets. The N-subjettiness score τ_N is the sum of 
pT-weighted radial moments for each subjet. In this moment, each bit of pT is multiplied by 
its distance to the subjet axis ΔR raised to a power β, which must be positive. Specifically, 
this is 

$$ \tau_N = \frac{1}{d_0} \sum_{J=1}^{N} \sum_{k \in \mathrm{subjet}_J} p_T^k \left( \Delta R_{J,k} \right)^{\beta} , \tag{8.8} $$

where d_0 is a normalization involving the jet size R_0 to keep τ between zero and one: 

$$ d_0 = \sum_{k \in \mathrm{jet}} p_T^k \, R_0^{\beta} . \tag{8.9} $$

There are three parameters: N, the exponent β, and the method of choosing axes. A simple 
way of choosing N axes is to undo a kT or Cambridge-Aachen clustering exactly N steps. A 
more effective method is to choose the axes that minimize the score τ. It's clear from the 
definition that the more a jet looks like it contains exactly N well-separated and individually 
well-collimated subjets, the lower the τ score is. 
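
For fixed axes, the score described above reduces to a nearest-axis sum. The sketch below assumes the form τ_N = (1/d_0) Σ_k pT_k (ΔR to nearest axis)^β with d_0 = Σ_k pT_k R_0^β, constituents as hypothetical (pT, y, φ) triples, and axes supplied by the caller as (y, φ) pairs. It does not perform the axis minimization mentioned in the text.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Distance on the (rapidity, phi) cylinder with phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def n_subjettiness(constituents, axes, beta, jet_radius):
    """tau_N for the given axes; each constituent is assigned to its nearest axis."""
    d0 = sum(pt for pt, _, _ in constituents) * jet_radius ** beta
    total = 0.0
    for pt, y, phi in constituents:
        nearest = min(delta_r(y, phi, ay, aphi) for ay, aphi in axes)
        total += pt * nearest ** beta
    return total / d0

# Toy two-prong jet (assumed values): (pT in GeV, rapidity, phi).
jet = [(50.0, 0.2, 0.0), (50.0, -0.2, 0.0)]
tau1 = n_subjettiness(jet, [(0.0, 0.0)], beta=1.0, jet_radius=0.5)
tau2 = n_subjettiness(jet, [(0.2, 0.0), (-0.2, 0.0)], beta=1.0, jet_radius=0.5)
print(tau1, tau2)
```

With one axis per constituent the score vanishes, which is the same mechanism that fills the zero bin when a jet has N or fewer tracks.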

In the plots in Figure 17, only charged tracks were included at ATLAS's request. If a 
jet contains only N charged tracks (with pT > 1 GeV here), the axes will coincide with these 
tracks and the score will be zero. It will also be zero if there are fewer than N tracks. This 
explains the strange-seeming excess in the zero-bin. N-subjettiness is also appealing because 
it can be calculated accurately in QCD, at least in some contexts [39]. 

[Figure 17 panel title: N-Subjettiness with Optimized Axes for Anti-kT R=0.5 Jets, Gluon Rejection at 50% Quark Acceptance for 50 GeV Jets] 

Figure 17. N-subjettiness gluon rejection as the parameters N, β, and jet size are varied. This is 
for 50 GeV jets simulated in Herwig++. The best βs are between 1/2 and 1. As usual, the best jet 
sizes are between 0.5 and 0.7. These trends hold for Pythia8 and for higher pT jets. 

8.8 Two-Point Moment 



A two-point moment is a sum over every pair of constituents (energy deposits or tracks). It 
is a sum of the product of the pTs of each pair, times their separation ΔR raised to a power β. 
It is normalized by the square of the jet pT to make it dimensionless and less sensitive to the jet pT itself. 

$$ \frac{1}{\left( p_T^{\mathrm{jet}} \right)^2} \sum_{i \in \mathrm{jet}} \sum_{j \in \mathrm{jet}} p_T^i \, p_T^j \left( \Delta R_{ij} \right)^{\beta} \tag{8.10} $$

It is a moment of the two-point function, which would be a function of ΔR. As long as β > 0, 
this is IRC safe. This is meant to capture an average separation between constituents. In 
Figure 18, the gluon rejection is shown as a function of the jet size for different values of β. 
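
A direct double loop over constituents gives the two-point moment. This minimal sketch assumes the form (1/pT_jet²) Σ_i Σ_j pT_i pT_j (ΔR_ij)^β summed over all ordered pairs, with constituents as hypothetical (pT, y, φ) triples and the jet pT passed in by the caller.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Distance on the (rapidity, phi) cylinder with phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def two_point_moment(constituents, beta, jet_pt):
    """Sum over ordered pairs of pT_i * pT_j * dR_ij^beta, divided by jet pT squared.

    The i == j terms vanish for beta > 0 because their separation is zero.
    """
    total = 0.0
    for pt_i, y_i, phi_i in constituents:
        for pt_j, y_j, phi_j in constituents:
            total += pt_i * pt_j * delta_r(y_i, phi_i, y_j, phi_j) ** beta
    return total / jet_pt ** 2

# Toy jet (assumed values): two equal constituents separated by dR = 0.4.
jet = [(50.0, 0.2, 0.0), (50.0, -0.2, 0.0)]
print(two_point_moment(jet, beta=1.0, jet_pt=100.0))
print(two_point_moment(jet, beta=0.25, jet_pt=100.0))
```

The cost grows quadratically with the number of constituents, which is acceptable for single jets but worth noting when running over large samples.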



[Figure 18 panel title: Two-Point Moment for different β's, Gluon Rejection at 50% Quark Acceptance for 50 GeV Jets] 

Figure 18. Two-Point Moment gluon rejection as the jet size and β parameter are varied. This is 
for 50 GeV jets simulated in Herwig++. The best β's are small, around 1/4. As usual, the best jet 
sizes are between 0.5 and 0.7. These trends hold for Pythia8 and for higher pT jets. 

8.9 Two-Dimensional Geometric Moments 

The radial moments above ignored how the pT was distributed around the jet axis. Motivated 
by the moment-of-inertia and covariance tensors, a second order 2D geometric moment tensor 
can be formed as shown in Figure 19. Combinations of its eigenvalues and eigenvectors (like 
Planar Flow) have been used to distinguish boosted objects. 

None of these variables turns out to be particularly useful for quark/gluon discrimination, 
so no distributions are shown here. Whether a quark emits a gluon or a gluon splits, the 
2-body kinematics are similar. Since it's this leading emission that dominates the subsequent 
shower, it is understandable that these shapes might not differ significantly between quarks 
and gluons. 
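
To make the 2D moments concrete, here is a minimal sketch of a pT-weighted covariance tensor and two eigenvalue combinations. It assumes constituents are hypothetical (pT, Δy, Δφ) offsets from the jet axis, that the tensor is the pT-weighted second moment of those offsets, and that planar flow is 4ab/(a + b)² for eigenvalues a ≥ b; these conventions are assumptions made for illustration.

```python
import math

def covariance_tensor(constituents, jet_pt):
    """pT-weighted second moments (C_yy, C_yphi, C_phiphi) of offsets from the axis."""
    c_yy = c_yphi = c_phiphi = 0.0
    for pt, dy, dphi in constituents:
        weight = pt / jet_pt
        c_yy += weight * dy * dy
        c_yphi += weight * dy * dphi
        c_phiphi += weight * dphi * dphi
    return c_yy, c_yphi, c_phiphi

def eigenvalues(c_yy, c_yphi, c_phiphi):
    """Eigenvalues a >= b of the symmetric 2x2 tensor."""
    trace = c_yy + c_phiphi
    det = c_yy * c_phiphi - c_yphi ** 2
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return 0.5 * (trace + root), 0.5 * (trace - root)

def planar_flow(a, b):
    """4ab / (a + b)^2: 0 for a line-like jet, 1 for an isotropic one."""
    return 4.0 * a * b / (a + b) ** 2

# Toy jets (assumed values): two prongs on a line, and four prongs in a cross.
line = [(50.0, 0.2, 0.0), (50.0, -0.2, 0.0)]
cross = [(25.0, 0.2, 0.0), (25.0, -0.2, 0.0), (25.0, 0.0, 0.2), (25.0, 0.0, -0.2)]
for jet in (line, cross):
    a, b = eigenvalues(*covariance_tensor(jet, 100.0))
    print(a, b, planar_flow(a, b))
```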





[Figure 19 contents: combinations of the eigenvalues a > b of the covariance tensor] 

Quadratic Moment: g = sqrt(a² + b²) 
Determinant: det = a · b 
Ratio: ρ = b/a 
Eccentricity: ε = sqrt(a² − b²) 
Planar Flow: pf = 4ab/(a + b)² 
Orientation: θ 

Figure 19. The Covariance Tensor and its eigenvalues and eigenvectors. 

8.10 Pull 



In Figure 20, we show accumulated pT for the same quark and gluon parton showered millions 
of times. In the large N_c approximation where these concepts apply, quarks have a single 
color connection and gluons have one color and one anti-color connection. In this particular 
event, the quark was color-connected with the beam remnant that went off to the left toward 
η = −∞. The gluon was connected to both outgoing beams. 

Pull tries to quantify the color connections. It was introduced in Ref. [34], and then 
immediately used in the DØ search for ZH with Z → νν̄ [40]. The pull vector of a jet 
is a pT-weighted moment that tends to point toward the jet or beam that is the 
color-connected partner of the jet's initiating quark. If the jet was initiated by a gluon, 
it is color-connected to two different places, so we might expect less pull. The pull vector 
is defined as 

$$ \text{Pull Vector}: \quad \vec{t} = \sum_{i \in \mathrm{jet}} \frac{p_T^i \, |\vec{r}_i|}{p_T^{\mathrm{jet}}} \, \vec{r}_i \quad \text{where} \quad \vec{r}_i = (y_i - y_{\mathrm{jet}}, \, \phi_i - \phi_{\mathrm{jet}}) . \tag{8.11} $$

If the factor of |r_i| were removed, this would be the jet's pT-weighted centroid. Unlike other 
moments, the pull vector is explicitly designed to not be rotationally invariant. The most 
effective way to use the pull vector in the Higgs study was to calculate a pull angle, which 
is the angle between the pull vector and the direction where it 'should' point if the jet was 
color-connected to some other object. We did not find pull angle very useful in distinguishing 
quarks from gluons. 
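
The pull vector and pull angle can be sketched as below. The example assumes the pull vector is Σ_i (pT_i |r_i| / pT_jet) r_i with r_i the (Δy, Δφ) offset from the jet axis, takes constituents as hypothetical (pT, y, φ) triples, and defines the pull angle as the unsigned angle between the pull vector and a caller-supplied target direction in the (y, φ) plane.

```python
import math

def pull_vector(constituents, jet_y, jet_phi, jet_pt):
    """Return (t_y, t_phi) = sum_i pT_i * |r_i| / jet_pT * r_i."""
    t_y = t_phi = 0.0
    for pt, y, phi in constituents:
        dy = y - jet_y
        dphi = (phi - jet_phi + math.pi) % (2.0 * math.pi) - math.pi
        weight = pt * math.hypot(dy, dphi) / jet_pt
        t_y += weight * dy
        t_phi += weight * dphi
    return t_y, t_phi

def pull_angle(pull, target_direction):
    """Unsigned angle in [0, pi] between the pull vector and a target direction."""
    angle = (math.atan2(pull[1], pull[0])
             - math.atan2(target_direction[1], target_direction[0]))
    angle = (angle + math.pi) % (2.0 * math.pi) - math.pi
    return abs(angle)

# Toy jet (assumed values): a hard core on the axis plus radiation toward +y.
jet = [(80.0, 0.0, 0.0), (20.0, 0.3, 0.0)]
pull = pull_vector(jet, 0.0, 0.0, 100.0)
print(pull, pull_angle(pull, (1.0, 0.0)), pull_angle(pull, (-1.0, 0.0)))
```

A small angle relative to the direction of a candidate partner is what one would expect for a color-connected pair in this picture.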




Figure 20. Distribution of radiation in quark and gluon jets accumulated over 3 million back-to-back 
dijet events with fixed parton kinematics. The color shows the average showered pT density in (η, φ) 
for an ensemble of events with fixed parton-level kinematics. (Contours are stepped in factors of two, 
which somewhat obscures the nearly identical jet pT in both cases.) 

9 Combining Variables 



A multivariate tagger can make the best use of several variables at the same time. In Figure 21, 
the 2D distributions of a good pair of variables are plotted for the quark and gluon samples. 
To find the best cut contours, one method is to combine these histograms into a likelihood 
histogram. This is done bin-by-bin by reading the values of the quark and gluon histograms 
and computing q/(q + g). If particular values are measured for each of the two variables, 
this likelihood is proportional to the probability that it is a quark jet. The constant of 
proportionality will depend on the prior distribution of quarks and gluons in your sample via 
Bayes' Theorem, but does not affect the contours. 

A cut on this likelihood score corresponds to a cut along some contour in the 2D 
plane. Each such cut gives some efficiency for keeping quark jets and some other efficiency 
for rejecting gluon jets. Cutting on the likelihood is optimal in the sense that it maximizes 
gluon rejection for every given quark acceptance [23]. Some ways of visualizing the effects of 
cuts and multivariate improvements were discussed in [41, 42]. 
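
The bin-by-bin likelihood construction can be sketched as follows. This is a minimal illustration that assumes the two 2D histograms are nested lists of raw counts with identical binning; the function names, the toy counts, and the choice of 0.5 for bins empty in both samples are assumptions made for the example.

```python
def likelihood_histogram(quark_hist, gluon_hist):
    """Bin-by-bin q/(q+g), where q and g are the normalized bin fractions."""
    q_total = float(sum(map(sum, quark_hist)))
    g_total = float(sum(map(sum, gluon_hist)))
    result = []
    for q_row, g_row in zip(quark_hist, gluon_hist):
        row = []
        for q, g in zip(q_row, g_row):
            qf, gf = q / q_total, g / g_total
            # Bins empty in both samples carry no information (assumed choice: 0.5).
            row.append(qf / (qf + gf) if qf + gf > 0.0 else 0.5)
        result.append(row)
    return result

def cut_efficiencies(quark_hist, gluon_hist, threshold):
    """Quark acceptance and gluon rejection when keeping bins with likelihood >= threshold."""
    likelihood = likelihood_histogram(quark_hist, gluon_hist)
    q_total = float(sum(map(sum, quark_hist)))
    g_total = float(sum(map(sum, gluon_hist)))
    q_kept = g_kept = 0.0
    for l_row, q_row, g_row in zip(likelihood, quark_hist, gluon_hist):
        for l, q, g in zip(l_row, q_row, g_row):
            if l >= threshold:
                q_kept += q
                g_kept += g
    return q_kept / q_total, 1.0 - g_kept / g_total

# Toy 2x2 histograms (assumed counts).
quark_hist = [[6, 2], [2, 0]]
gluon_hist = [[1, 2], [2, 5]]
print(likelihood_histogram(quark_hist, gluon_hist))
print(cut_efficiencies(quark_hist, gluon_hist, 0.8))
print(cut_efficiencies(quark_hist, gluon_hist, 0.5))
```

Scanning the threshold traces out the ROC curve for the pair of variables, with each threshold corresponding to one contour in the 2D plane.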

To populate a 2D histogram such that each bin has a statistically meaningful number of entries is 
difficult without an enormous number of events. For more than 2 variables, it is practically 
impossible to populate the higher-dimensional histograms with any accuracy. For example, 
for 5 variables, even if each variable had only 10 coarse divisions, there would still be 10^5 bins 
to populate. This is where multivariate techniques like Boosted Decision Trees are useful [23]. 
Using a limited number of training events, these techniques assign a score to each point. With 
a large enough training sample, this score is in 1-1 correspondence with the likelihood. 




Figure 21. Combining Variables: 2D distributions are shown for a powerful pair of variables. The 
Likelihood can be formed by combining these histograms bin-by-bin as q/(q + g), where q and g are 
the fraction of events in the appropriate bin of the quark and gluon histogram, respectively. The blue 
regions mean that an event with that pair of values is more likely to be quark. A cut on the likelihood 
corresponds to a cut along one of the contours, and this can be proven to be the optimal cut for that 
signal efficiency. 

10 Comparing Variables 



A pair of variables that always performs at or near the top in our multivariate rankings is 
charged track count combined with girth (also known as the linear radial moment or jet width). 
This pair was shown in Figure 21. In a computationally intensive search, we also came up 
with the best group of five variables, which differed for each $p_T$ window and ranking method 
(gluon rejection at 80%, 50%, or 20% quark acceptance). 

The ROC curves (gluon rejection versus quark acceptance) for interesting variables are 
shown in Figure 22 for $p_T$ = 100 GeV jets. Unlike previous curves, the rejection is plotted on a 
logarithmic axis. A 1% background acceptance corresponds to a background rejection of $10^2$. 
The best curve corresponds to the best group of five variables, but the best pair (charged 
track multiplicity & girth) is not far behind. Simply taking the product of these two creates 
a single variable that does better than each individually, but only for harder cuts (lower 
quark acceptances). The good discrete observables like counting charged tracks or counting 
small subjets do best at high quark acceptance (and as we saw before, high jet $p_T$). The 
good continuous observables like girth tend to do best at lower quark acceptance and lower 
jet $p_T$. Jet mass tends to be somewhere in the middle, and 2D geometric moments like pull 
and planar flow are never particularly powerful. 

The best variable depends on the desired signal acceptance operating point, which depends 
on the application. For example, one might try to maximize the significance ($\sigma \equiv 
S/\sqrt{B}$) of a small signal above a large background. An advantage of maximizing the 
significance is that each operating point translates into an improvement factor (which should 
be greater than one if the cut is useful), independent of the initial significance. This 
improvement factor is also independent of integrated luminosity and the signal and background 
counts themselves. To see this, note that cutting on a variable changes the significance by 

$$\sigma = \frac{S}{\sqrt{B}} \to \frac{\epsilon_s S}{\sqrt{\epsilon_b B}} = \sigma \frac{\epsilon_s}{\sqrt{\epsilon_b}} .$$

With a simple 1-1 transformation, a ROC curve can be turned into a significance improvement 
curve (SIC) [41]. Samples are shown in Figure 23 for the 100 GeV sample. 
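
A minimal sketch of this transformation is given below. It assumes the ROC curve is stored as pairs of signal efficiency $\epsilon_s$ and background efficiency $\epsilon_b$, and maps each pair to the improvement factor $\epsilon_s/\sqrt{\epsilon_b}$; the function name and the toy ROC points are illustrative and are not values from the figures.

```python
import math


def significance_improvement_curve(roc_points):
    """Map (signal efficiency, background efficiency) points to SIC points.

    Each output point is (signal efficiency, eps_s / sqrt(eps_b)). Points
    with zero background efficiency are skipped, since the ratio diverges.
    """
    sic = []
    for eps_s, eps_b in roc_points:
        if eps_b > 0:
            sic.append((eps_s, eps_s / math.sqrt(eps_b)))
    return sic


# Toy ROC points (illustrative numbers only).
toy_roc = [(0.2, 0.01), (0.5, 0.09), (0.8, 0.36), (1.0, 1.0)]
toy_sic = significance_improvement_curve(toy_roc)
best_eps_s, best_improvement = max(toy_sic, key=lambda point: point[1])
```

The point with no cut, (1.0, 1.0), always maps to an improvement factor of one, which gives a convenient sanity check on any such curve.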

For all variables, the cuts that optimize quark/gluon significance improvement tend to 
be quite harsh, leaving only ~20% of the quark sample. For rare signals with few events, 
looser cuts might be required to see any events at all. In cases where the background to a 
quark jet new physics signal is not 100% gluon jets, looser cuts end up giving the optimal 
significance improvement. QCD backgrounds at low $p_T$ are only around 80% gluon. This is 
discussed further in Section 12. 

The gluon rejection curves for the best group of five variables for each of the $p_T$ samples 
are shown in Figure 24. There is one best group of five when optimizing rejection at 20% 
quark acceptance, and a different group of five when optimizing at 80%. Transformations of 
these into Significance Improvement Curves are shown in Figure 25. The exact scores depend 
on whether Pythia8 or Herwig++ is used and whether all particles or just charged tracks are 
used. This is discussed further in Section 13 and numerical results are shown at the end, in 
Table 1. 

Figure 22. ROC curves for $p_T$ = 100 GeV jets for selected variables. These curves show the 
background (gluon jet) rejection ($1/\epsilon_b$) as a function of the signal (quark jet) acceptance 
efficiency ($\epsilon_s$). 

Figure 23. Significance Improvement Curves for $p_T$ = 100 GeV jets for selected variables. These curves 
show the significance improvement $\epsilon_s/\sqrt{\epsilon_b}$ as a function of $\epsilon_s$. 

Figure 24. Gluon Background Rejection for the best groups of five variables for each $p_T$. The dashed 
lines are the groups that maximize gluon rejection at 80% quark efficiency. The solid lines maximize 
significance improvement, which the next figure shows happens around 20% quark efficiency. Higher 
$p_T$ jets lead to greater rejection power, mostly because the charged track count is a more powerful 
observable. 

Figure 25. Significance Improvement curves for best groups of five variables. This contains the 
same information as the previous figure, but shows that small differences in gluon rejection at low 
efficiencies lead to large differences in significance improvement. 



11 Using Impure Samples to Verify Underlying Pure Distributions 

When training a multivariate method to distinguish quarks from gluons, it would be ideal to 
have huge samples of pure quarks and pure gluons. The quark fractions of several samples are 
shown in Figure 26. For low jet $p_T$, none of the samples are more than 90% pure. By making cuts, 
this can be increased at the cost of having fewer training events. For example, in $\gamma$ + 2 jet 
events, when the softer jet is close to the photon, it is very likely a quark jet. Similar cuts 
can be made to purify gluons from multijet samples. This was discussed in reference [3]. 

Ideally, one would like to combine information from high cross section, low-purity samples 
with low cross section, high-purity samples. One approach to combining information from 
different samples would be to first verify the jet property predictions of Monte Carlo generators. 
If the Monte Carlo generators are sufficiently accurate, huge numbers of simulated events 
can then be used to train multivariate classifiers. The distributions of each jet property and 
correlations between them can be checked against data. For example, the jet mass distribution 
of a simulated 90/10 mix should match a 90% pure sample. If two observables provide most 
of the discrimination power, their 2D histogram can be compared to the linear combination of 
pure quark and gluon jets produced by the Monte Carlo. Low statistics, high purity samples 
will constrain some combination of measurements, while high statistics mixed samples will 
constrain a different combination. 

Figure 26. The chance that a given jet is a light quark jet rather than a gluon jet. (This ratio does 
not include bottom or charm.) The W and Z were nearly identical and combined on this plot, but 
they are slightly different from the photon, mostly due to the $\gamma$ and lepton cuts, which were each at 
20 GeV. 

More generally, each jet can be assigned a likelihood of being quark or gluon based on its 
kinematics (nearness to a photon, for example) rather than its intra-jet properties. If it passes 
some kinematic threshold, the jet and its likelihood can be used to train a classifier. More 
certain probabilities can be assigned higher weights. Although the multivariate techniques 
popular in high energy physics and implemented in the TMVA package [23] require separate 
signal and background samples, within each sample events can be assigned different weights. 

An alternative data-driven approach is to fit for the underlying pure distributions in an 
impure sample where the quark/gluon fractions are assumed to be known. For example, in 
a small window around 50 GeV, dijets with certain kinematic cuts are 23% quarks. Jets in a 
$\gamma$+jet sample are 88% quarks. If the jet mass distribution is measured for these two 50 GeV 
samples, the underlying quark and gluon distributions can be found by solving a simple 2x2 linear 
system bin-by-bin. These pure underlying quark and gluon mass distributions can then be 
reweighted and compared to a sample with a different known quark/gluon fraction. If there 
is consistency across samples, it would justify the use of quark and gluon jet discrimination. 
ATLAS calls this the template method [26, 43]. In this approach, bottom and charm jets 
are accounted for by trusting the Monte Carlo simulation for both their distribution and the 
value of their small composition fraction. 
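
The bin-by-bin solution can be sketched as follows, assuming each measured histogram h is a mixture h = f q + (1 - f) g of pure quark and gluon templates with a known quark fraction f, and ignoring the heavy-flavor component. The function name and the pure shapes used in the closure check are hypothetical; only the 88% and 23% quark fractions come from the text.

```python
def unfold_pure_templates(hist_a, hist_b, quark_frac_a, quark_frac_b):
    """Solve h = f*q + (1-f)*g bin-by-bin for the pure templates q and g.

    hist_a and hist_b are normalized histograms (lists of bin fractions) of
    the same observable in two samples with known quark fractions.
    """
    det = quark_frac_a - quark_frac_b
    if det == 0:
        raise ValueError("samples need different quark fractions")
    quark, gluon = [], []
    for h_a, h_b in zip(hist_a, hist_b):
        quark.append(((1 - quark_frac_b) * h_a - (1 - quark_frac_a) * h_b) / det)
        gluon.append((quark_frac_a * h_b - quark_frac_b * h_a) / det)
    return quark, gluon


# Closure check with made-up pure shapes and the fractions quoted in the text.
true_quark = [0.5, 0.3, 0.2]
true_gluon = [0.2, 0.3, 0.5]
photon_jet = [0.88 * q + 0.12 * g for q, g in zip(true_quark, true_gluon)]
dijet = [0.23 * q + 0.77 * g for q, g in zip(true_quark, true_gluon)]
quark_fit, gluon_fit = unfold_pure_templates(photon_jet, dijet, 0.88, 0.23)
```

The determinant of the 2x2 system is the difference of the two quark fractions, so the extraction becomes statistically unstable when the two samples have similar compositions.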

Fitting to pure or mixed samples assumes a universality to quark and gluon properties, 
which may only partly hold. Selection effects might induce atypical distributions, for example 
if harsh cuts keep only the kinematic tails of distributions. Jet mass, for instance, might look 
different at very high $\eta$. One sample may be busier than another, with many other nearby 
jets leaking into the ones you are interested in. Jets in a training sample may have different 
color-connections than jets in the sample you ultimately wish to tag. Color-singlet quark pairs 
from a W might look different than beam-connected quarks in $\gamma$+jet. Many of these issues 
are not particular to quark/gluon tagging and should be kept in mind for any substructure 
study. 

12 Choosing the Operating Point for a Mixed Background 

The cut that most improves significance depends on the quark/gluon composition of the real 
signal and background. So far we have considered the signal to be only quark jets and the 
background to be only gluon jets. QCD has around 20% quark jets at low $p_T$ and more for 
higher $p_T$. Other common backgrounds were shown in Figure 26. 

Once a tagger is trained, you need to pick an operating point. For example, you could 
pick the fraction of quark jets you are willing to keep: the quark efficiency $\epsilon_q$. This translates 
into a particular cut on the observables, either directly or via a cut on a multivariate output. 
The ROC curve shows the gluon efficiency $\epsilon_g$ (the fraction of gluon jets that get past the best 
cut) for a given $\epsilon_q$. 

If the signal is not pure quark and the background is not pure gluon, a cut with these 
quark and gluon efficiencies will translate into signal and background efficiencies in a way that 
depends on the fraction of signal made of quarks $s_q$, signal made of gluons $s_g$, background 
made of quarks $b_q$, and background made of gluons $b_g$. In this case, 

$$\epsilon_s = s_q \epsilon_q + s_g \epsilon_g \quad \text{and} \quad \epsilon_b = b_q \epsilon_q + b_g \epsilon_g . \qquad (12.1)$$

Now suppose you start with S signal events and B background events. A cut with signal 
efficiency $\epsilon_s$ and background efficiency $\epsilon_b$ changes the statistical significance in a simple way 
as before: 

$$\sigma = \frac{S}{\sqrt{B}} \to \frac{\epsilon_s S}{\sqrt{\epsilon_b B}} = \sigma \frac{\epsilon_s}{\sqrt{\epsilon_b}} . \qquad (12.2)$$

If the signal started with some significance $\sigma$, the cut will improve it by a factor $\epsilon_s/\sqrt{\epsilon_b}$, 
which is called the "Significance Improvement" in the plots below. It depends not only on 
the performance of the tagger, but on the quark/gluon makeup of your signal and background 
and where one chooses to operate. A ROC curve for quark/gluon discrimination ($\epsilon_g$ vs $\epsilon_q$) can 
be easily transformed into a Significance Improvement Curve ($\epsilon_s/\sqrt{\epsilon_b}$ vs $\epsilon_s$) using equations 
12.1 and 12.2. In the first plot of Figure 27, several such curves are shown for signals that 
are 100% quark, but backgrounds that are mixed. The curve labeled 0% has a background 
that is purely gluon jets and corresponds to the curves shown previously. Even when the 
background has only 10% quark jets, the maximum achievable significance improvement has 
fallen quite a bit. The second plot in this figure shows those maxima as a continuous function 
of the background's quark fraction. A typical low-$p_T$ QCD background has 15% quarks and 
is indicated. Clearly the tagger will be most useful when the signal and background are both 
quite pure. 
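
The effect of a mixed background can be checked numerically with a short sketch of equations 12.1 and 12.2. It assumes the gluon fractions are one minus the quark fractions (no heavy-flavor component), and the operating point of 20% quark efficiency and 1% gluon efficiency is a hypothetical choice, not a value read from the figures.

```python
import math


def mixed_significance_improvement(eps_q, eps_g, signal_quark_frac, background_quark_frac):
    """Significance improvement eps_s / sqrt(eps_b) for mixed samples.

    Uses eps_s = s_q*eps_q + s_g*eps_g and eps_b = b_q*eps_q + b_g*eps_g,
    with s_g = 1 - s_q and b_g = 1 - b_q (no heavy-flavor component assumed).
    """
    s_q, s_g = signal_quark_frac, 1.0 - signal_quark_frac
    b_q, b_g = background_quark_frac, 1.0 - background_quark_frac
    eps_s = s_q * eps_q + s_g * eps_g
    eps_b = b_q * eps_q + b_g * eps_g
    return eps_s / math.sqrt(eps_b)


# Hypothetical operating point: keep 20% of quarks and 1% of gluons.
pure = mixed_significance_improvement(0.20, 0.01, 1.0, 0.0)
mixed = mixed_significance_improvement(0.20, 0.01, 1.0, 0.15)
```

At this operating point the improvement factor is 2 for a pure-gluon background but falls to about 1.02 when 15% of the background is quark jets, because the quark component of the background passes the cut as efficiently as the signal does.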

The same analysis can be performed for the published b-tagger ROC curves. Generic low-$p_T$ 
QCD backgrounds have around 2% b-jets. For this value, the significance improvement 
peaks at around 60% b-acceptance. This is the typical operating point of these taggers. 

Figure 27. On the left is an illustration of how the significance improvement curve changes when the 
background is not pure gluon jets, but contains the indicated fraction of quark jets. The points show 
the maximum significance improvement. On the right, these maxima are plotted as a function of the 
quark fraction in the background. The QCD background for jets with $p_T$ > 20 GeV is approximately 
15% quark. For higher $p_T$, the QCD quark fraction goes up and the maximum significance 
improvement decreases. 

13 Comparing Herwig++ to Pythia8 

Recently, ATLAS presented some quark/gluon measurements [43] of charged track count and 
linear radial moment, which they call jet width. They compared their data to Herwig++ and 
to two Pythia8 tunes. They found that both simulations described the quark jet properties 
better than gluon jet properties. 

Our simulation, shown in Figure 28, indicates that various simulations agree with each 
other for quark jets. We find that the distributions for gluon jets in Pythia 8.165 are consistently 
farther away from the quark distributions than they are in Herwig++ 2.5.2. ATLAS 
found that Herwig++ agreed with data better for charged tracks, but Pythia8 agreed better 
for width. In both the data and Herwig++, quarks and gluons look more similar to each 
other than they do in Pythia8 (which was used for most of the plots in this paper). Gluon 
rejection is consistently around 10% worse for Herwig++ than for Pythia8. As a function of 
things like jet size, radial moment power, or the number of subjets, the difference between 
Herwig++ and Pythia8 is just an overall shift. This means that all of the single-variable 
trends described in the bulk of this paper still hold. We find less advantage in combining 
variables with multivariate techniques for jets simulated using Herwig++ than with Pythia8. 

ATLAS also found that various jet grooming techniques not only helped with pileup, 
but also gave a groomed jet mass that was better simulated than the ungroomed mass. To reject pileup, 
ATLAS also used only the charged tracks, thus ignoring their calorimeter for everything 
except the overall jet $p_T$. To account for this, in our comparison of Pythia8 to Herwig++, 
we use only charged tracks. A summary and comparison of various simulations using all and 
just charged tracks is shown in Table 1. 

Combining variables sometimes helps, but mostly for harder jets. At 50% quark efficiency, 
combining radial moment with track count gives an additional 0.4% to 1.9% gluon rejection, 
depending on jet $p_T$ and generator. For Herwig++, this improvement increases with jet $p_T$, 
but for Pythia8, it maxes out at 200 GeV. These small shifts should be compared to gluon 
efficiency, not gluon rejection (one minus efficiency). For Pythia8 200 GeV jets, going from 9% 
efficiency down to 7% efficiency has a sizable effect on significance improvement. It directly 
translates into a 20% better S/B. But Herwig's efficiency here starts at the higher 19% and 
only goes down to 17.3%, a more modest 9% improvement in S/B. The improvements for 
50 and 100 GeV jets are smaller than for 200 GeV jets. 

Figure 28. Distributions of charged track count and linear radial moment (here calculated using 
only the charged tracks within the jet) for 50 GeV jets. Quark samples are blue and gluon samples 
red. Pythia 8.165 is the lighter shade and Herwig++ 2.5.2 is in the darker shade. 

14 Conclusions 



In this paper, we have performed a comprehensive Monte-Carlo-based multivariate study of 
how quarks and gluons can be distinguished on an event-by-event basis. We considered thousands 
of variables, which can generally be classified in two broad classes: discrete variables 
that count the number of tracks in a jet, and continuous variables, such as jet shapes. A 
general conclusion is that discrete variables help more at high jet $p_T$ and high quark efficiency 
(loose cuts), whereas continuous variables help more at lower jet $p_T$ and lower quark 
efficiency (tight cuts). Overall discrimination also works better at higher $p_T$. 

We find, not surprisingly, that the more information that is used, the better the discrimination: 
counting smaller subjets works better than larger subjets; using all particles to 
calculate jet moments rather than just charged tracks works better. For charged track count, 
even using larger R=0.7 jets helps. The best two-variable discriminant often involves one 
variable from each class. Including three or more variables in a multivariate discriminant 
shows only small improvements over two variables. The marginal improvement of combining 
variables is also larger at high $p_T$ than at low $p_T$. 

We have found that there is a consistent difference in how variables perform when Pythia8 
or Herwig++ is used for the simulation. Pythia8 simulations show more differences between 
quarks and gluons, but unfortunately, early studies with data indicate that Herwig++ more 
accurately predicts quark/gluon discrimination power. We find that single variables can reject 
about 90% of gluon jets while keeping 50% of quark jets if Pythia8 can be trusted, but only around 
80% of gluon jets with Herwig++. Some quantitative results are shown in Table 1. Since 
there are significant differences between different event generators, one use of quark and gluon 
tagging would be to tune these generators to match data more accurately. Such improved 
tunings could have important implications for a number of substructure based analyses. 

In conclusion, quark/gluon discrimination is difficult, but not impossible. A gluon 
rejection of 85% at 50% quark acceptance seems feasible; however, further input from experiment is 
needed. 

Acknowledgments 

The authors would like to thank D. Mateos, D. Miller, A. Schwartzman, M. Silva and M. 
Swiatlowski for useful discussions. This work was supported in part by the Department 
of Energy under grant DE-SC003916. Computations for this paper were performed on the 
Odyssey cluster supported by the FAS Research Computing Group at Harvard University. 






Gluon Efficiency (%) at 50% Quark Acceptance

                                                                     50 GeV                     200 GeV
                                                            Particles       Tracks      Particles       Tracks
Variable                                                      P8    H++     P8    H++     P8    H++     P8    H++
2-Point Moment $\beta$=1/5                                   8.7*  17.8*  13.7*  22.8*   8.3   15.9   13.2   19.6
1-Subjettiness $\beta$=1/2                                   9.3   18.5   14.2   22.9    7.6   16.2   12.3   19.4*
2-Subjettiness $\beta$=1/2                                   9.2   18.6   13.9   23.6    6.8   15.7*   9.8   18.7
3-Subjettiness $\beta$=1                                     9.1   19.3   14.6   24.4    5.9*  16.7    8.6*  19.5
Radial Moment $\beta$=1 (Girth)                             10.3   20.5   16.1   24.9   11.2   18.9   15.3   21.9
Angularity a = +1                                           10.3   20.0   15.8   24.5   12.0   19.3   14.0   21.6
Det of Covariance Matrix                                    11.2   21.2   18.1   27.0    9.4   20.9   13.5   24.6
Track Spread: $\sqrt{\langle p_T^2\rangle}/p_T^{\rm jet}$   16.5   25.3   16.5   25.3    9.3   20.1    9.3   20.1
Track Count                                                 17.7   26.4   17.7   26.4    8.9   21.0    8.9   21.0
Decluster with $k_T$, $\Delta R$                            15.8   24.5   20.1   28.4   13.9   20.1   16.9   23.4
Jet m/pT for R=0.3 subjet                                   13.1   25.9   16.3   27.7   11.9   24.2   14.8   26.2
Planar Flow                                                 28.7   34.4   28.7   34.4   39.6   42.9   39.6   42.9
Pull Magnitude                                              37.0   39.0   32.9   35.6   30.6   30.2   29.6   30.6
Track Count & Girth                                          9.9   20.1   13.4   23.2    7.1   17.3    7.7*  18.7
R=0.3 m/pT & R=0.7 2-Point $\beta$=1/5                       7.9*  17.7   12.2*  22.1    5.7   14.4*   8.5   17.9
1-Subj $\beta$=1/2 & R=0.7 2-Point $\beta$=1/5               8.5   17.3*  12.9   22.1    6.0   14.6    8.6   17.7*
Girth & R=0.7 2-Point $\beta$=1/10                          12.6   21.9   12.6   21.9*   9.2   18.0    9.2   18.0
1-Subj $\beta$=1/2 & 3-Subj $\beta$=1                        8.9   18.0   14.0   23.2    5.6*  15.0    8.4   18.4
Best Group of 3                                              7.5   17.0   11.0   20.9    4.7   14.0    6.9   16.6
Best Group of 4                                              7.1   16.7   10.6   20.5    4.5   13.7    6.2   16.3
Best Group of 5                                              6.9   16.4   10.4   20.0    4.3   13.3    6.1   15.9

Table 1. Comparison of gluon efficiencies at the 50% quark acceptance working point. All of the 
single variables use R=0.5 jets, whereas combinations sometimes include R=0.7 jets. Gluon efficiencies, 
rather than gluon rejections (one minus efficiencies), are shown because a fractional improvement here 
is the same fractional improvement in S/B. Divided by two, it is also the fractional improvement in 
$S/\sqrt{B}$. These scores have ±0.5% statistical errors, but they are correlated: the differences between 
variables have a smaller spread, as does the improvement when combining variables. Because of the large 
number of variables and parameters, and the larger number of possible combinations of these, there is 
definitely a look-elsewhere-type effect when choosing the top pair. Many pairs statistically tied for the 
top spot in each category, so five pairs were chosen as representative. Their scores are marked with 
asterisks, as are the best individual variables in each category. The best groups of 3, 4, and 5 start to 
show diminishing returns. 
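
As a worked illustration of how a change in gluon efficiency propagates to S/B and $S/\sqrt{B}$, the sketch below compares two entries of Table 1 at fixed quark acceptance, assuming a pure-gluon background. The function name is hypothetical. The fractional drop in gluon efficiency agrees with the fractional gain in S/B to first order; the exact factor is the ratio of the two efficiencies.

```python
import math


def tagging_gain(eps_g_before, eps_g_after):
    """Compare two gluon efficiencies at the same quark acceptance.

    Returns the fractional drop in gluon efficiency, together with the exact
    factors by which S/B and S/sqrt(B) change for a pure-gluon background.
    """
    fractional_drop = (eps_g_before - eps_g_after) / eps_g_before
    s_over_b_factor = eps_g_before / eps_g_after
    s_over_sqrt_b_factor = math.sqrt(eps_g_before / eps_g_after)
    return fractional_drop, s_over_b_factor, s_over_sqrt_b_factor


# Table 1, 200 GeV, all particles: better single variable vs Track Count & Girth.
pythia = tagging_gain(8.9, 7.1)    # track count alone -> pair
herwig = tagging_gain(18.9, 17.3)  # girth alone -> pair
```

For the Pythia8 entries the fractional drop is about 20%, and for the Herwig++ entries about 8.5%, in line with the rounded figures quoted in Section 13.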